Rogue Tech Talks: May 2024

In computing, speed is all about delivering information from the storage devices (e.g., disks) to the peripheral devices (e.g., monitors) in real time. Shane Croft has produced a utility that addresses this issue. He demonstrated his program to the group tonight.

Over time, as users add data to a file, its storage must expand, and more disk blocks are needed to hold the data. Successive disk blocks are linked together in a daisy chain. The blocks in that chain are not necessarily adjacent on the disk; so when the user accesses a file that has grown beyond its initial allocation, the system must look up the address of each successive block, locate the block, and link it into the chain in order to present a continuous flow of information to the user.

When a backup/restore process is performed, each file is restored to sequential disk blocks, because the backup process knows how many disk blocks to make available based on the current size of the stored data.

Aside from performing a backup/restore process, there is an ad hoc process available to efficiently reconfigure the data files, and that’s the area of computing that Shane addressed.

Shane Croft presented the following tech talk about a disk cleaner he produced.

He demonstrated his process on his laptop, and explained that he had already set up files in the Registry that were ready to be moved. Shane then displayed the files that were set to be moved.

He also has a Find Large Files feature. He tells the program to list all the files on his C: drive that are over a gigabyte in size. So if he needs to hunt down large files, there they are.
He has an M.2 SSD in his system and, when it’s running on battery, the CPU is underclocked. Even so, the scan had been running for less than 10 seconds and had already covered 200,000 files.
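
A minimal sketch of a large-file scan like the one described above, written in Python rather than the VB6 the tool uses. The function name and the threshold parameter are illustrative assumptions, not taken from Shane’s program.

```python
import os

def find_large_files(root, min_bytes=1024 ** 3):
    """Yield (path, size) for each regular file under root larger than min_bytes."""
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)  # descend later, no recursion
                        elif entry.is_file(follow_symlinks=False):
                            size = entry.stat(follow_symlinks=False).st_size
                            if size > min_bytes:
                                yield entry.path, size
                    except OSError:
                        continue  # entry vanished or is unreadable; keep scanning
        except OSError:
            continue  # folder is unreadable; keep scanning the rest
```

Calling list(find_large_files("C:\\")) would list everything over a gigabyte on that drive; unreadable folders are skipped rather than stopping the scan.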

C: Wow!

Q: So, what’s the mechanism you use to scan?
A: He uses the Unicode (“W”) versions of the Windows API calls, so if someone runs it on a non-English system, it will still pull up files with Unicode names.

He also has a Duplicate File Finder. He made his own hash, based on CRC32. Because of VB6’s 32-bit limitations in reading large files, he broke the work up, and actually sped it up, by hashing sections of the file and then combining them instead of hashing the whole file in one pass.

The Find Duplicate Files feature and the CRC32-based hash he made scan files amazingly fast and produce a proper file hash, so two files can differ by one byte, or even one bit, and it will detect it. So it’s super-fast code. He compared it to one of the fastest duplicate file scanners out there and was able to match its speed. By contrast, a standard MD4, MD5, or CRC32 pass over a whole file is not fast. But the way he did it, breaking the file up and loading the pieces into memory, it “hauls butt.”

Q: Are you hitting up a bunch of threads to work with the individual pieces?
A: No, no. One thread. It’s not multi-threaded. It reads the file into memory in pieces of a certain size, hashes each piece, and then combines the results into one. So, instead of reading in one large file, it reads it in small bits, which is quicker. The API just works faster that way.

Q: And so, you get CRC for each chunk…
A: I made my own chunk…
Q: …and stitch them together?
A: So, imagine a large file and you’re taking pieces…
Q: Yup.
A: You get a small CRC32 for each piece, I keep a list of those CRC32s, and I base the final CRC32 off of what’s in the list.
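
The exchange above can be sketched roughly as follows. This is an illustration in Python, not Shane’s VB6 code; the chunk size, the function name, and the way the list of per-chunk values is packed before the final CRC32 are all assumptions.

```python
import struct
import zlib

def chunked_crc32(path, chunk_size=1024 * 1024):
    """Hash a file as the CRC32 of the list of its per-chunk CRC32 values."""
    chunk_crcs = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)      # read one piece at a time
            if not chunk:
                break
            chunk_crcs.append(zlib.crc32(chunk))  # small CRC32 for this piece
    # Assumption: pack the list as little-endian 32-bit unsigned integers.
    packed = struct.pack(f"<{len(chunk_crcs)}I", *chunk_crcs)
    return zlib.crc32(packed)               # final CRC32 based on the list
```

The result is not the same number a standard CRC32 of the whole file would give, so it is only comparable with other hashes produced the same way and with the same chunk size.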

He added two new pro features. The first cleans up invalid Windows Firewall rules. If you had an app that had a firewall rule but then you uninstalled the app, this searches through all the firewall rules to see if any of the .exe files they point to are missing. It lists them and lets you clean up all the old rules that are no longer needed.
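
The check itself is simple once the rules are in hand. The sketch below assumes the rules have already been read from Windows into a hypothetical FirewallRule record; how the tool actually enumerates them was not covered in the talk and is not shown here.

```python
import os
from dataclasses import dataclass

@dataclass
class FirewallRule:
    """Hypothetical record for one firewall rule (not a Windows structure)."""
    name: str
    program: str  # path to the rule's executable; empty if the rule has none

def find_stale_rules(rules, exists=os.path.isfile):
    """Return the rules whose executable no longer exists on disk."""
    return [r for r in rules if r.program and not exists(r.program)]
```

Rules with no program path are left alone, since a missing executable cannot be what makes them invalid.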

The last tool concerns the Registry. Registry cleaners are snake oil, but the Registry is a database and it gets bloated just like any other database does. So he made a Registry Compact and Reindex Tool, which basically calls the Windows registry save API, writes the Registry files out to brand-new ones, and then sets up the rename; you reboot and Windows starts using the newly made files.

With this function, you can hit Analyze and it will write the Registry files out to new files to tell you what the size will be. If you want to go ahead, you can tell it to compact, and then you have to reboot immediately afterward.

In this case, because it’s a fresh install, there’s about a 5.2% difference if he compacts; he’ll save about 8.7 MB. This particular system is not a heavily used system, so the Registry hasn’t had a lot of reads and deletes, but on a system that’s been around for a while, this can help speed things up.
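
The two figures are related by simple arithmetic: a saving of 8.7 MB at 5.2% implies the Registry files total roughly 8.7 / 0.052, or about 167 MB, before compacting. That total is inferred here, not a number quoted in the talk. A small sketch of the calculation an Analyze step might report, with an assumed function name:

```python
def compaction_savings(before, after):
    """Return (amount saved, percent saved) for sizes given in the same unit."""
    saved = before - after
    percent = 100.0 * saved / before if before else 0.0
    return saved, percent

# Inferred example sizes in MB, chosen to match the 8.7 MB and 5.2% quoted above.
saved_mb, saved_pct = compaction_savings(167.3, 158.6)
```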

You don’t actually want Registry Cleaners because they can break stuff. But, getting rid of the registry bloat, that’s a tool you don’t really find out there. So that’s one of the pro features as well.

And then there are the other pro features, including some he has planned. The tool can also handle paths longer than the standard Windows limit of 260 characters (MAX_PATH). Not even Explorer can do this, but his program can delete files that have longer pathnames, thanks to the Unicode API. You have to begin every single file path with \\?\ followed by the drive and the rest of the path (c: and so on), and that’s how you can get to those longer path names.
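
As a rough illustration of the prefixing described above, here is a small Python helper. The function name is an assumption, and the input is assumed to be an absolute Windows path; the UNC branch follows the documented \\?\UNC\ form for network shares.

```python
def to_extended_path(path):
    """Add the \\\\?\\ extended-length prefix to an absolute Windows path."""
    prefix = "\\\\?\\"
    if path.startswith(prefix):
        return path                      # already prefixed; leave unchanged
    # Prefixed paths are not normalized by Windows, so use backslashes only.
    path = path.replace("/", "\\")
    if path.startswith("\\\\"):
        # Network share: \\server\share\... becomes \\?\UNC\server\share\...
        return prefix + "UNC\\" + path[2:]
    return prefix + path                 # drive path such as C:\folder\file
```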

One of the other pro features: he’s going to have it check the other profiles that are on the system. So it checks each user’s temp folder and local data. Say you’re on a Remote Desktop server with 40 users: this will scan all 40 profiles, instead of you logging in as each person.
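
A minimal sketch of the idea, assuming the folder that holds the user profiles (typically C:\Users) is passed in directly. The function name is illustrative, and only the per-user AppData\Local\Temp folder is checked here.

```python
import os

def profile_temp_dirs(profiles_root):
    """Return the Temp folder of every profile under profiles_root that has one."""
    found = []
    with os.scandir(profiles_root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue                 # skip files and linked folders
            temp = os.path.join(entry.path, "AppData", "Local", "Temp")
            if os.path.isdir(temp):
                found.append(temp)
    return sorted(found)
```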

Q: Does it need special permissions?
A: It already declares the backup/restore privileges, so it’s allowed access to all the files. A program requesting the right privileges is key to that. The backup privilege allows it to read, and the restore privilege allows it to delete and move. It also handles symbolic links by checking whether one exists and skipping it.
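
The link-skipping behavior might look something like this in Python. It is a sketch only: the function name is an assumption, and it does not attempt the backup/restore privilege handling mentioned above.

```python
import os

def walk_skipping_links(root):
    """Yield every regular file path under root, ignoring symbolic links."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Drop linked folders so they are never descended into.
        dirnames[:] = [
            d for d in dirnames
            if not os.path.islink(os.path.join(dirpath, d))
        ]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):   # skip links that point at files
                yield full
```

Skipping links keeps a cleaner from deleting files that really live somewhere else on the system.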

One of the other pro features: he’s going to let you add your own custom registry or file locations, along with the file types and keys you want it to clean. So if you have a special app that creates log files, you can add your own entry to have it cleaned.

He’s going to add features to the duplicate finder to move or delete the sets of duplicates it finds.
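
One way to picture “sets of duplicates” is to group files by size and hash, then treat everything after the first file in each group as removable. The sketch below uses a plain streaming CRC32 as a stand-in for Shane’s own hash, and all names in it are assumptions.

```python
import os
import zlib
from collections import defaultdict

def file_crc32(path, chunk_size=1024 * 1024):
    """Standard CRC32 of a file, computed incrementally chunk by chunk."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc

def duplicate_sets(paths):
    """Group paths that share both size and CRC32; return groups of 2 or more."""
    groups = defaultdict(list)
    for p in paths:
        groups[(os.path.getsize(p), file_crc32(p))].append(p)
    return [sorted(g) for g in groups.values() if len(g) > 1]

def extras_to_remove(dup_set):
    """Keep the first file of a set; the rest are candidates to move or delete."""
    return dup_set[1:]
```

A matching size and CRC32 is strong evidence but not proof that two files are identical, so a cautious tool would compare the bytes before deleting anything.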

He was also thinking about offering a submit button for custom clean locations, so you can submit yours and other people can download and use it; things like that.

He was also thinking about some work he did years ago: he wrote a drive speedometer that tells you the current read and write speeds of all the drives. He was thinking about redoing that and maybe including a drive speed pro feature. So, not only can you see the space in real time, but you can see how much reading and writing is happening on a disk drive.

Q: Can’t you do that now with Task Manager?
A: You can, but it’s about having a nice little monitor: just two little progress bars, little widgets on the desktop, like that.

C: I love things like that!

Right now, this checks for 2404 possible things it can clean, and when it detects them on the system it can clean them. It found 142 of them, all unwanted files, within 2-3 seconds.

Besides clogging up your hard drive, junk files make the MFT grow out of hand, and writes start to slow down because the MFT becomes fragmented and large. So, having a huge number of files on a drive is not always the best thing, and just cleaning up junk files helps.

Q: Does it do that per drive?
A: Right now, it checks a fixed set of locations. Once he adds the custom locations feature, you can tell it to check other locations.

Q: I can see a need for restricting its use to, say, your main drive, initially. Many people, including me, have a lot of garbage on junk drives that we use to keep stuff on, something between a backup and a working drive. But my main C: drive and, maybe, my secondary SSD are my two big working spaces. I can see where you wouldn’t want it to start cleaning up those junk drives.
A: No, it doesn’t touch any other drives. It reads an internal .ini file. The only time it would check other drives is if you have user profiles on a separate drive. So it will call the Windows API to check where the user profiles are stored. It polls those locations based on the API. Right now, it can handle 2400 programs and apps, all read from that .ini file. I custom-created that .ini file. I read the .ini file line by line and I have my own code to break it apart, so it loads up everything. My program loads that in 2 seconds vs. 30 seconds for the Windows cleaner version. I just read the file directly and write my own stuff to break it apart.
This is written in VB6 because I don’t like the .NET runtimes. So, that’s the kind of stuff I’m able to do with VB6.
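
The line-by-line .ini reading mentioned in that answer can be sketched as follows. This is Python rather than VB6, and the layout it accepts (bracketed section names, key=value pairs, ; or # comments) is an assumption, since the format of Shane’s custom .ini file was not shown.

```python
def parse_ini_lines(lines):
    """Parse .ini-style text line by line into {section: {key: value}}."""
    sections = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith((";", "#")):
            continue                      # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()  # start of a new section
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            sections[current][key.strip()] = value.strip()
    return sections
```

Passing an open file object works directly, since iterating over a file yields one line at a time.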

Discussion on VB6 followed.

Thanks to Shane Croft for his presentation!

Author: Karen
Written: 5/27/24
Published: 5/27/24
Copyright © 2024, FPP. All rights reserved.