© J.C. Kessels 2009
MyDefrag Forum
Topic: Bug or inefficient release?
amk
« on: July 30, 2009, 03:15:25 pm »

When MyDefrag clears a gap (AddGap, MakeGap) or makes room for a file it is moving (FastFill), it moves the entire obstructing file, starting at offset 0. Because there are no limits on where that file may be placed, it frequently lands in the next gap (and the moving begins again).
I think it would be more efficient to move up only the fragments that lie in the area being cleared.
The entire file should be moved only if doing so appreciably lowers its fragmentation.
Logged
jeroen
Administrator
« Reply #1 on: July 30, 2009, 04:17:59 pm »

Quote
When MyDefrag clears a gap (AddGap, MakeGap) or makes room for a file it is moving (FastFill), it moves the entire obstructing file, starting at offset 0.
No, the program only moves the fragment that is in the way, not the entire file. Also, it is not moved to the next gap but to the next zone. Perhaps your disk is very full? Then the program has to fall back on the biggest gap above the area to be vacated.
Logged
SteveB
« Reply #2 on: July 31, 2009, 12:43:14 pm »

Jeroen,

Quote
Also it is not moved to the next gap, but to the next zone
Sorry, but that's not strictly 100% correct based on the behaviour I've observed... which brings me to my slightly wider issue, and a suggested resolution.

With apologies for what will overall be a long post, a brief digression is probably in order at this point to explain how I encountered/noticed this problem. In the default FastOptimize script each zone has an "AddGap(UntilPercentageOfVolumeMultiple(x))" action after the main FileAction. I'm assuming this is in order to align the start of the next zone to the same position between one run of MyDefrag and the next (until the zone overflows that boundary, when it'll move by that amount "in one go"), thus minimising the 'never ending' movement of files in the next zone that'd otherwise result from every change in the current zone.

In slightly modifying the script for my own use I kept that (very sensible) idea, although changing the multiples' measurement types and sizes. I also wanted to ensure that certain free spaces were left so that new files could be created in the faster parts of the disk, with distinct gaps between directories and 'boot' files, and between regular files and spacehogs (very much like the old jkdefrag default strategy). I therefore added a second AddGap(..'fixed %'..) action in order to guarantee a certain minimum size for the overall gap, e.g.:
        FileSelect
          Directory(yes)
        FileActions
          Defragment()
          FastFill()
          AddGap(UntilMegabytesMultiple(16))
          AddGap(PercentageOfVolume(1))
        FileEnd
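
A minimal sketch of the arithmetic behind the two gaps above, assuming the behaviour described in this thread: the first gap runs from the end of the zone up to the next 16 MB multiple (so it is 0-16MB in size), and the second adds a fixed 1% of the volume after it. The function names, the byte-based positions, and the second gap starting exactly where the first ends are assumptions for illustration only, not MyDefrag internals.

```python
MIB = 1024 * 1024

def round_up_to_multiple(position, multiple):
    """Smallest multiple of `multiple` that is >= position."""
    return ((position + multiple - 1) // multiple) * multiple

def gap_ends(zone_end, volume_size, align_mib=16, percent=1):
    """Return (end of alignment gap, end of fixed-size gap), in bytes.

    Hypothetical model: the alignment gap ends at the next align_mib
    boundary; the fixed gap then adds percent% of the volume size.
    """
    aligned_end = round_up_to_multiple(zone_end, align_mib * MIB)
    fixed_end = aligned_end + volume_size * percent // 100
    return aligned_end, fixed_end

# Two runs whose zone ends differ slightly but fall inside the same
# 16 MiB bucket produce the same boundaries, so the next zone stays put.
run_1 = gap_ends(100 * MIB + 5, 1000 * MIB)
run_2 = gap_ends(105 * MIB, 1000 * MIB)
```

In this model the next zone only shifts when the current zone grows past a 16 MiB boundary, which is the alignment effect described above.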

Several processes - recording TV in Media Center or taking a system image, for example - regularly create new multi-gigabyte files, which Windows will tend to place near-ish the start of the drive. Just from keeping an eye on the first few test-runs of the script I could observe the following:

  • A very large (5-10GB) new file had been placed just after the directories as that gap was the first large enough to hold it; it was non-fragmented.
  • Upon encountering the first AddGap (as above) the whole file was moved to beyond the end of that (0-16MB) gap - note: not beyond the end of the whole zone as you indicated.
  • The second AddGap then moved the whole file again to beyond the end of that (1%) gap - since that butted up to the start of the boot files it was placed in the next gap large enough, at the end of that boot files zone.
  • The boot files were processed, and then the AddGap at the end of that zone once again moved the whole file, this time to the gap at the end of the regular files zone.
  • The regular files were processed, and then the AddGap at the end of that zone moved the whole file out beyond the specified "UpToMultiple".
  • The second fixed-size AddGap (I'd added) in that zone then moved the whole file again to beyond the end of that gap - since that butted up to the start of the spacehog files it was placed in the next gap large enough, at the end.
  • If there was sufficient "tightening up" done in the spacehogs zone the whole file was once more moved, this time back down as a result of the FastFill() of that zone.

To move the file into the correct zone MyDefrag has re-written it 5-6 times! Although I only spotted it because of the time taken to move a >10GB file and seeing the same filename appear several times in quick succession, this multiple-moving obviously must happen for every file that's nearer the start of the disk than its eventual assigned zone, with the number of re-writes differing according to:
  • Fewer re-writes occur without the "doubled up" AddGap()'s (used to ensure a minimum gap size whilst still maintaining alignment of the next zone's start between iterations), but the default FastOptimize script can still re-write the same file 4-5 times.
  • Non-spacehog files will be re-written fewer times according to which zone they are selected in, and files that're initially placed after the first defined zone/gap (or more, but still before their target zone) will also have correspondingly fewer re-writes.
  • Files can be re-written even more times if there are gaps large enough to hold them within zones that're between their initial position and their target zone - they will/could be put there when being vacated from the earlier gap and then will be moved out of there when the normal (non-gap) FileActions for that zone are performed.

I believe this produces a significant overall reduction in performance. The "ideal" solution would probably be to do some form of "pre-analysis" of the whole script as well as analysing the drive, and doing some calculation as to a "best-guess" final location when vacating a file - but I strongly suspect this is either impossible or at least excessively complex to attempt.

Instead I suggest the following outline algorithm should be used when vacating a file - whether from a gap or moving it "out of the way" during other file actions - that isn't selected for the current zone:
  • If there is a large enough gap immediately following the last file on the disk, move it there.
  • If not, find the last gap on the disk that's big enough.
  • If there's no gap large enough, start filling gaps from the end of the disk forwards.
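
The three steps above can be sketched roughly as follows. This is only an illustration of the proposal, not MyDefrag code: free space is modelled as a list of (start, length) extents sorted by start, `gaps` is assumed to list only extents outside the area being vacated, and the name `choose_vacate_target` is made up for the example.

```python
def choose_vacate_target(gaps, last_file_end, size):
    """Return a list of (start, length) placements for a vacated file.

    gaps: free extents as (start, length), sorted by start (assumed input).
    last_file_end: position just past the last file on the disk.
    size: size of the file being vacated, in the same units.
    """
    # Step 1: a large enough gap immediately following the last file.
    for start, length in gaps:
        if start == last_file_end and length >= size:
            return [(start, size)]

    # Step 2: otherwise the last gap on the disk that is big enough.
    for start, length in reversed(gaps):
        if length >= size:
            return [(start, size)]

    # Step 3: no single gap fits, so fill gaps from the end of the
    # disk towards the start, splitting the file into pieces.
    placements = []
    remaining = size
    for start, length in reversed(gaps):
        if remaining == 0:
            break
        used = min(length, remaining)
        placements.append((start, used))
        remaining -= used
    if remaining:
        raise ValueError("not enough free space to vacate the file")
    return placements
```

Note that step 3 fragments the file, which the proposal accepts only as a last resort.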

In almost all cases, this will mean that a file is moved only twice - once 'out' as above, and once back to its final location when processing the zone it's selected in. Because it's vacated to a later part of the disk than it is currently, the moves will be very slightly slower, but I believe that should be more than compensated for by the absence of additional third, fourth, fifth, etc. moves.

Lesser "edge case" factors would be:
  • For a small number of cases it will mean two moves where one would have been required currently - but that's only when the file being vacated: belongs in the immediately subsequent zone, is not moved again by a subsequent FileAction in the current zone, and is not moved by the FileAction(s) of the subsequent zone.
  • In a few cases - spacehogs that don't get subsequently re-positioned when processing that final zone - it will result in only one move where the current approach may move it multiple times (3+ times in the default FastOptimize if it was written into the gap after directories).
  • If a script doesn't select some files at all in any zone, but doesn't actually exclude them, they may be 'vacated' to beyond the last file or to the end of the disk and never brought back, leaving them at a slower disk position than necessary.


Obviously I'm not as familiar with the exact performance characteristics and trade-offs as you (or some others here) but it seems as though this should give an overall speed-up in almost all cases, with relatively little degradation in the worst case scenarios. In any case, the suggested algorithm is proposed for your consideration.

Many thanks for your excellent efforts,
Steve


[As an aside: is there a way to analyse how many times in a run a file (or fragment thereof) is moved - maybe from a more detailed debug log? A distribution of # moves per file, perhaps by file size, might be interesting.]
Logged
amk
« Reply #3 on: July 31, 2009, 02:05:32 pm »

SteveB, for economy you may replace
Quote
AddGap(UntilMegabytesMultiple(16))
AddGap(PercentageOfVolume(1))
with
Quote
AddGap(UntilMegabytesMultiple(16) PercentageOfVolume(1))

Quote
it is not moved to the next gap, but to the next zone.
The free space in the next zone is the gap at the end of that zone. The next action then clears the 2nd zone's free space and moves the file to the 3rd zone, and so on...
Logged
SteveB
« Reply #4 on: July 31, 2009, 06:48:09 pm »

amk,

Thanks very much for that tip - an initial test seems to confirm that it at least removes the "double moving" that happens if the two gap specifications are done in separate statements. This slightly mitigates but doesn't resolve the general issue raised.

Regarding your second comment, if I understand your point correctly: it isn't necessarily the case that a file being vacated from one zone will be moved into free space at the end of the next zone. If there's a large enough space to contain the file anywhere within the set of files in the next zone - e.g. as a result of changes since the last defrag - then it's moved there and not further out to a gap defined at the end of that zone.

What I saw before, but haven't explicitly re-confirmed with the tweak to a single AddGap line, is that in such a situation the "incorrect" file in the middle of a zone will be moved out during a FileAction such as FastFill into the first large-enough space beyond its own scope, not taking into account any subsequent AddGap in that zone. I.e. it'll be moved into the end-of-zone gap and then moved again soon after to vacate that gap.

Assuming that remains the case, it means that a newly created file in the gap after the ntfs files, that matches the definition of spacehogs but also fits into 1 or more gaps in both the boot and regular files zones (a 4MB mp3 file, say), will still be re-written 6 times by the FastOptimize script (or 7 times if it's also re-positioned during the FastFill of the last zone).
Logged
amk
« Reply #5 on: July 31, 2009, 07:22:43 pm »

As written above, the file is moved to the biggest contiguous free area. On a heavily filled disk that is frequently the gap at the end of the next zone.
This behaviour cannot be changed without pre-classification of files. Maybe jeroen will add this in the future.

But that is not what I was talking about.
On one computer the space-clearing procedure behaves strangely: it moves not just the interfering fragment, but the entire file.
I failed to reproduce this behaviour on my home PC. There everything works correctly.
Logged
jeroen
Administrator
« Reply #6 on: August 01, 2009, 09:19:08 am »

Quote
the whole file was moved to beyond the end of that (0-16MB) gap - note: not beyond the end of the whole zone as you indicated.
Files are moved to above the end of the current zone, which in your case happens to be the beginning of an AddGap. The AddGap is a zone to MyDefrag, so it moves files to beyond the end of the gap. In your case that happens to be the beginning of another AddGap. The tip from amk works, it combines the gaps into a single gap.

Quote
The "ideal" solution would probably be to do some form of "pre-analysis" of the whole script as well as analysing the drive, and doing some calculation as to a "best-guess" final location when vacating a file - but I strongly suspect this is either impossible or at least excessively complex to attempt.
It would be a good solution, yes. It has been suggested before and I have it on my wishlist. Problem is that such a pre-analysis will take a long time. The multiple movement of a file as you have noticed does not happen very often, and large files will usually go immediately to the end of the disk. So I'm not sure if the benefits will outweigh the extra calculation time.

Quote
Instead I suggest the following outline algorithm
It's a good idea, but you have happened upon a worst-case situation. Vacating files will usually happen on files that belong to the zone itself, especially when the zone is being sorted. Practically all the files in the zone will have to be moved away, so they can be placed back again in the selected ordering. In this case moving files to the end of the disk would make the program very slow. It is a lot better to vacate to a location as near above the end of the zone as possible.

Quote
is there a way to analyse how many times in a run a file (or fragment thereof) is moved - maybe from a more detailed debug log?
Uncomment the "Debug(175)" setting in your "C:\Program Files\MyDefrag v4.1.1\Scripts\Settings.MyD" script for more information in the debug logfile.
Logged
SteveB
« Reply #7 on: August 01, 2009, 11:46:19 am »

Jeroen,
Many thanks for your responses.

Quote
The AddGap is a zone to MyDefrag...
Glad we agree on the behaviour. I was misinterpreting your use of "zone", assuming you meant "the whole area covered by a FileSelect and all its associated FileActions" (or a separate MakeGap, of course), which seems to be how it's presented in the GUI in terms of the action descriptions - "Zone 1: Vacating gap", etc.

Quote
Problem is that such a pre-analysis will take a long time
Indeed - and be hideously difficult, and still subject to change when encountering unmovable files. I'm actually slightly surprised it's even on your wishlist - while it may be theoretically brilliant I just don't see it as a realistic, practical approach! The best that I can see being reasonable is to calculate not only the projected end of a Sort/Fill action's zone but also the end of any immediately subsequent Gap - then if the file being moved away belongs in the current zone it's moved after the first boundary, but if it doesn't it's moved after the second.
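
A rough sketch of that two-boundary idea, under stated assumptions: free space is a list of (start, length) extents sorted by start, the vacated file goes into the first extent above the chosen boundary that can hold it (a first-fit choice made here for simplicity, not something confirmed about MyDefrag), and the function name and parameters are hypothetical.

```python
def pick_vacate_position(gaps, size, zone_end, following_gap_end,
                         belongs_to_current_zone):
    """Return the start position for a vacated file, or None if nothing fits.

    Files that belong to the current zone only need to clear zone_end;
    files that belong elsewhere must also clear the gap that follows it.
    """
    bound = zone_end if belongs_to_current_zone else following_gap_end
    for start, length in gaps:
        # Only the part of the extent at or above the boundary is usable.
        usable_start = max(start, bound)
        usable_length = start + length - usable_start
        if usable_length >= size:
            return usable_start
    return None
```

With free extents at (100, 50) and (200, 100), a zone ending at 100 and a following gap ending at 150, a 30-unit file that belongs to the zone would land at 100, while one that does not would land at 200 and so avoid being moved again when the gap is vacated.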

Quote
The multiple movement of a file as you have noticed does not happen very often, and large files will usually go immediately to the end of the disk. So I'm not sure if the benefits will outweigh the extra calculation time.
I'm not convinced about the first sentence (either part), I'm afraid; more below. Totally agree with the last statement, which is why I suggested a "less idealistic" approach.

Quote
It's a good idea, but you have happened upon a worst-case situation. Vacating files will usually happen on files that belong to the zone itself, especially when the zone is being sorted. Practically all the files in the zone will have to be moved away, so they can be placed back again in the selected ordering.
I used the worst case as an example to highlight the issue, and because that's what I actually observed in the first place (the reason I noticed the issue) - but I'm not at all sure that "bad" cases are necessarily that rare.

You're dead right about the Sort actions - I guess because I use the FastOptimize approach I wasn't thinking about those scenarios. When FastFill-ing I'm not sure that files being vacated would usually belong to the zone itself - why are they being vacated in that case? When vacating from defined gaps, of course, the statement must be false since the files are by definition not members of the current zone (in either interpretation of "zone").

Maybe there's another little bit of 'terminology clash' here? There are several different scenarios to cover:
  • Moving multiple fragments of a file to a new location to defragment it. I've not been considering this as being "vacating" a file - not sure if you have (e.g. if all file 'moves' are effectively the same in the code?). Wasn't really included in my issue, although if the defragmentation is being done as a "side effect" of it being moved away for one of the reasons below then the same comments below would apply.
  • Moving a file away to create space while doing a SortBy action or MoveDownFill. I hadn't been thinking about this at all, but you're right: the existing approach should be used, for the reasons you state.
  • Moving a file out of a FastFill action's zone because it doesn't belong there. This suffers from the problem I outlined. The only situation in which it won't be moved multiple times is if: there is no AddGap or MakeGap between the FastFill and the next FileSelect, or all such defined gaps do not have a space large enough for the file; and the file is matched by the very next FileSelect; and there is enough free space for the file either in a gap within the area of files matched by the next FileSelect or immediately following its last file.
  • Moving a file out of a defined gap (AddGap or MakeGap without DoNotVacate). Again, this can often suffer from the problem. The file is moved once only if the file is matched by the very next FileSelect, and there is enough free space for the file either in a gap within the area of files matched by the next FileSelect or immediately following its last file (and assuming there's no additional Add/MakeGap before the next FileSelect, which is reasonable given multiple gap specifications can be used in a single line). Otherwise it'll be moved at least twice, or more.
Are there other cases I've missed that aren't covered by the above?

The first two are fine with the current approach, whereas IMO the last two suffer from 'unnecessary' multiple moves more often than not. Whether it'd even be practical to use separate strategies for two different purposes of file moves is of course up to you.

In the meantime, it seems the onus may be on me to provide some evidence that the issue occurs often enough for a fix to be worthwhile (or whether my basic observations are unconsciously biased, and not actually representative of real behaviour). I'll try to hack up a bit of vbscript to analyse some debuglog files, and get some stats on what proportions of files are moved how many times during a FastOptimize.
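
The counting step of such an analysis could look like the sketch below, written in Python rather than vbscript. It assumes the file name of every move has already been extracted from the debug log into a list; the log format is not shown in this thread, so the parsing is deliberately left out, and `move_distribution` is a made-up name.

```python
from collections import Counter

def move_distribution(moved_paths):
    """Summarise how often each file was moved during one run.

    moved_paths: one entry per move event, holding the moved file's path.
    Returns (moves per file, {number of moves: number of files}).
    """
    moves_per_file = Counter(moved_paths)
    distribution = Counter(moves_per_file.values())
    return moves_per_file, dict(sorted(distribution.items()))

# Placeholder data, not taken from a real MyDefrag log.
events = ["a.wtv", "b.mp3", "a.wtv", "c.dll", "a.wtv", "b.mp3"]
per_file, histogram = move_distribution(events)
```

Files with a count above 2 in `per_file` would be the candidates for the 'unnecessary' multiple moves discussed above.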
Logged
jeroen
Administrator
« Reply #8 on: August 01, 2009, 12:04:51 pm »

Thank you very much for sharing your thoughts, I appreciate it. I also appreciate you trying to help. But I don't have time to read lengthy postings or to discuss theory. Sorry, but I have to prioritize.
Logged