© J.C. Kessels 2009
MyDefrag Forum
May 22, 2013, 09:33:56 am
Topic: Zoned Sorting optimization + Frequency parameter idea (Read 3721 times)
poutnik
JkDefrag Hero
Posts: 1105
Zoned Sorting optimization + Frequency parameter idea
« on: June 12, 2008, 04:38:53 pm »
Hi Jeroen,
You may have already considered all these ideas, but in case you have not.....
It is well known that Sorting optimization can arrange the disk according to personal preferences,
but it costs a full disk rearrangement.
On the other hand, IMHO, having all files perfectly sorted is not worth all that work.
I have no clear idea how zones in Version 4 will work, so what about Zoned sorting optimizations?
There would be a given (implicit or explicit) number of zones and parameter limits.
Every file would belong to one of the zones, which can be managed by the same algorithm that fast optimization uses to manage zones 2 and 3.
There can be the same optional gaps as there are between the -a3 zones, maybe optionally not between all zones.
The optimal number of zones can be a question of analysis or personal preferences.
Just rough example:
Sort by modified time
Zone 1 - modification older than 3 months
Zone 2 - modification older than 1 month
Zone 3 - modification older than 1 week
Zone 4 - modification in last 7 days
Sort by access time (until a better frequency parameter is available)
Zone 1 - accessed in last hour
Zone 2 - accessed in last 24 hours
Zone 3 - accessed in last week
Zone 4 - accessed in last month
Zone 5 - Others
IMHO this kind of optimization could be run on a daily basis, as fast optimization is run now.
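To make the rough example concrete, here is a minimal sketch of how a file could be assigned to one of the four "Sort by modified time" zones. It is only an illustration, not how JkDefrag works: the names `MODIFIED_ZONES` and `zone_by_modified` are made up, and "3 months" and "1 month" are assumed to mean 90 and 30 days.

```python
from datetime import datetime, timedelta

# Zone limits from the rough example, checked from oldest to newest.
# Assumption: "3 months" = 90 days and "1 month" = 30 days.
MODIFIED_ZONES = [
    (1, timedelta(days=90)),  # Zone 1 - modification older than 3 months
    (2, timedelta(days=30)),  # Zone 2 - modification older than 1 month
    (3, timedelta(days=7)),   # Zone 3 - modification older than 1 week
]
LAST_ZONE = 4                 # Zone 4 - modification in last 7 days


def zone_by_modified(modified, now):
    """Return the zone number for a file last modified at `modified`."""
    age = now - modified
    for zone, min_age in MODIFIED_ZONES:
        if age > min_age:
            return zone
    return LAST_ZONE


now = datetime(2008, 6, 12)
print(zone_by_modified(datetime(2008, 1, 1), now))   # 163 days old -> 1
print(zone_by_modified(datetime(2008, 6, 10), now))  # 2 days old -> 4
```

Within a zone the files would not need any further ordering, which is what keeps a daily run cheap.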
Related idea about the last access versus frequency problem, without an extra driver:
- small scheduled utility that would run every 1-4 hour in idle priority
- frequency would be set equal to zone 1-2 limit for access frequency sorting
- at 1st run utility will scan file access times and record them
- at every next run it will compare scanned access times with previous recorded ones
- if there is a change it will record access time and increase access counter
- it will keep last N access times
- That way it will keep access counts for an interval equal to the limit between the last and the last-but-one zone (maybe twice as long, to be stable)
( variation - keeping data only to end of next zones )
- Here are sample zones for "Sort by frequency" (example):
zone 1 - accessed "all the time" - i.e. every scan increases the counter
zone 2 - accessed daily at least once
zone 3 - accessed weekly at least once
zone 4 - accessed monthly at least once ( or once in monitoring interval )
zone 5 - least accessed files
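The scan-and-compare steps above could be sketched roughly as follows. This is an assumption-laden illustration, not an existing utility: `scan_access_times`, `update_history` and the `keep_last` parameter are hypothetical names, the history is kept in a plain dictionary, and scheduling and saving the history between runs are left out.

```python
import os


def scan_access_times(root):
    """Return {path: last access time} for every file found under `root`."""
    times = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                times[path] = os.stat(path).st_atime
            except OSError:
                continue  # file vanished or is not readable, skip it
    return times


def update_history(history, scanned, keep_last=8):
    """Merge one scan into `history`.

    `history` maps path -> {"count": int, "times": [access times]}.
    A file seen for the first time only gets its access time recorded.
    On later scans a changed access time increases the counter, and only
    the last `keep_last` access times are kept.
    """
    for path, atime in scanned.items():
        entry = history.get(path)
        if entry is None:
            history[path] = {"count": 0, "times": [atime]}
        elif atime != entry["times"][-1]:
            entry["count"] += 1
            entry["times"].append(atime)
            del entry["times"][:-keep_last]  # drop the oldest entries
    return history
```

A scheduled run would then be something like `update_history(history, scan_access_times("C:\\"))`, with the counts and kept times used to pick the frequency zone of each file.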
Logged
It can be fast, good or easy. You can pick just 2 of them....
Treating Spacehog zone by the same effort as Boot zone is like cleaning a garden by the same effort as a living room.
jeroen
Administrator
JkDefrag Hero
Posts: 7155
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #1 on: June 12, 2008, 05:48:17 pm »
Quote from: poutnik on June 12, 2008, 04:38:53 pm
I have no clear idea how zones in Version 4 will work, so what about Zoned sorting optimizations?
Thanks for sharing your ideas, I appreciate it! Version 4 will have a small scripting language. One of the things that will be possible is to define as many zones as you want, and to select an optimization method per zone.
Quote
Related idea about the last access versus frequency problem, without an extra driver:
Your idea depends on the last access time being updated by Windows. But Vista does not do that by default, and many people have turned it off on XP for performance reasons.
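Whether last access updates are switched off can be checked with the Windows command `fsutil behavior query disablelastaccess`. The sketch below wraps that call. The function names are hypothetical, the assumed output format is a line like `DisableLastAccess = 1`, and the command may need an elevated prompt on some Windows versions.

```python
import re
import subprocess


def parse_disable_last_access(output):
    """Extract the numeric setting from fsutil output, or None if not found.

    Assumption: the output contains a line such as "DisableLastAccess = 1".
    """
    match = re.search(r"disablelastaccess\s*=\s*(\d+)", output, re.IGNORECASE)
    return int(match.group(1)) if match else None


def query_disable_last_access():
    """Run fsutil (Windows only) and return the parsed setting."""
    result = subprocess.run(
        ["fsutil", "behavior", "query", "disablelastaccess"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_disable_last_access(result.stdout)
```

On XP and Vista a value of 1 means the last access stamp is not updated, so access-based sorting would have nothing to work with.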
Logged
poutnik
JkDefrag Hero
Posts: 1105
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #2 on: June 12, 2008, 06:38:34 pm »
Quote from: jeroen on June 12, 2008, 05:48:17 pm
Thanks for sharing your ideas, I appreciate it! Version 4 will have a small scripting language. One of the things that will be possible is to define as many zones as you want, and to select an optimization method per zone.
I thought it would probably be like that.
Quote from: jeroen on June 12, 2008, 05:48:17 pm
Your idea depends on the last access time being updated by Windows. But Vista does not do that by default, and many people have turned it off on XP for performance reasons.
Sure, it depends. A driver solution is no doubt more general. The question is whether anybody who does not care about the last access stamp would care about frequency sorting, and what the performance and stability of a driver solution would be compared with leaving last access updates turned on.
Well, if it were based on Superfetch usage statistics, it could be great and efficient.
But it could be limited to executables (all or some) only.
In fact I did not notice any performance drop when I turned access stamps on,
probably because timestamps are not updated in real time.
technet2.microsoft.com
Quote
The Last Access Time on disk is not always current. This lag occurs because NTFS delays writing the Last Access Time to disk when users or programs perform read-only operations on a file or folder, such as listing the folder's contents or reading (but not changing) a file in the folder. If the Last Access Time is kept current on disk for read operations, all read operations become write operations, which impacts NTFS performance.
Note that file-based queries of Last Access Time are accurate even if all on-disk values are not current. NTFS returns the correct value on queries because the accurate value is stored in memory.
NTFS typically updates a file's attribute on disk if the current Last Access Time in memory differs by more than an hour from the Last Access Time stored on disk, or when all in-memory references to that file are gone, whichever is more recent. For example, if a file's current Last Access Time is 1:00 P.M., and you read the file at 1:30 P.M., NTFS does not update the Last Access Time. If you read the file again at 2:00 P.M., NTFS updates the Last Access Time in the file's attribute to reflect 2:00 P.M. because the file's attribute shows 1:00 P.M. and the in-memory Last Access Time shows 2:00 P.M.
NTFS updates the index of the directory that contains the file when NTFS updates the file's Last Access Time and detects that the Last Access Time for the file differs by more than an hour from the Last Access Time stored in the directory's index. This update typically occurs after a program closes the handle used to access a file within the directory. If the user holds the handle open for an extended time, a lag occurs before the change appears in the index entry of the directory.
Note that one hour is the maximum time that NTFS defers updating the Last Access Time on disk. If NTFS updates other file attributes such as Last Modify Time, and a Last Access Time update is pending, NTFS updates the Last Access Time along with the other updates without additional performance impact.
« Last Edit: June 13, 2008, 06:33:24 am by poutnik »
Logged
poutnik
JkDefrag Hero
Posts: 1105
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #3 on: June 16, 2008, 09:09:49 am »
Quote from: jeroen on June 12, 2008, 05:48:17 pm
One of the things that will be possible is to define as many zones as you want, and to select an optimization method per zone.
Maybe it would not be such a bad idea to do "subzone" sorting internally, without user intervention.
It would be done within each zone, whether default or user defined.
Strict sorting is IMHO too much effort for too little gain.
E.g. PerfectDisk uses 32 zones, although I do not know how it works with them in detail.
Logged
tOM Trottier
JkDefrag Hero
Posts: 82
tOM
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #4 on: June 16, 2008, 11:33:05 pm »
It matters where the zones are. The real estate has different values (view, convenience, ...)
Stuff near the middle has less lag to get to. (Ideal for pagefile, directories, ...)
Stuff near the start is read faster (Ideal for big, often-read sequential files, like EXEs)
Stuff near the end is slow to read and slow to get to (Ideal for logs, MP3s, spacehogs)
Ideally, free space should also be near the middle so it is quickly gotten to.
tOM
Logged
poutnik
JkDefrag Hero
Posts: 1105
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #5 on: June 17, 2008, 05:34:50 am »
I cannot disagree - well known facts.
But I am not fully sure which part of the topic it relates to.....
My post was about how to sort a zone - no matter where it is and how big - once you have decided to sort the zone by some criterion.
I should rather have called the zones in "zone sorting" subzones, or internal zones, as I thought about later.
Mixing zones and zones was confusing.
« Last Edit: June 17, 2008, 05:41:45 am by poutnik »
Logged
tOM Trottier
JkDefrag Hero
Posts: 82
tOM
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #6 on: June 29, 2008, 03:59:25 am »
We talk about sorting because it is a simple way to group stuff on some dimension. But does it matter if file C follows B follows A on any characteristic of that file? No.
If you knew the order they were accessed together, and the delays between one file access and the next, you could put an upper bound on how far apart they can be before access time grows. I think this is what BootVis, XP's boot load optimiser, does.
For general disk optimisation, ideally, you would like a similar determination - but this is not possible on a global basis. It can only be done for files - keeping all the clusters together - and applications - keeping the DLLs and data nearby, and then multitasking shuffles the accesses all up anyway.
So on a global basis, you want to stuff the seldom accessed stuff near each end of the disk, especially the slow end, away from the middle, and do all your accesses near the middle where it's only a millisecond or two between tracks/cylinders. The most heavily used storage should be the most "middle", and you access only small parts, at the very middle. These are directory entries, indexed database files, etc.
But the middle is not exactly the physical middle. If you put all the seldom referenced stuff near the slow end, you can consider that not even part of the disk - so the logical middle slides towards the fast end.
And the fast end you fill with big sequential files like EXEs.
tOM
Logged
poutnik
JkDefrag Hero
Posts: 1105
Re: Zoned Sorting optimization + Frequency parameter idea
« Reply #7 on: June 29, 2008, 11:14:51 pm »
Thank you for your ideas.
To be more clear,
I suggested the above-mentioned Frequency parameter idea just as one kind of access sorting. Nothing more or less, and I was not pretending it is the best one.
Neither did I claim that access-like sorting is the best approach in all cases :-)
I did not intend to turn this thread into a discussion of what sorting, if any, should be done,
or what file placement is the best.
Similarly, zone sorting was meant for the case where we want to sort by whatever criteria.
Whether to sort, and what criteria to sort on, was not intended as the subject of the thread.
Files (a letter stands for the order by whatever criteria)
ACGEHFDB can be sorted like
ABCDEFGH (strict sort) - expensive, low gain/cost ratio, being fully sorted.
ACBDHFGE (least sorted, 2 subzones A-D, E-H) - even a partial sort can give a significant gain/cost ratio.
Or anything in between........
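The letter example can be tried out with a small sketch. It is only an illustration under assumptions: `partial_sort` and `zone_of_letter` are made-up names, a file already lying inside its own subzone is never moved, and misplaced files fill the vacated slots in their order of appearance. The order inside a subzone therefore comes out slightly different from the example above, which is fine because it does not matter.

```python
from collections import deque


def partial_sort(files, zone_of, zone_count):
    """Group `files` into contiguous subzones, moving as few files as possible.

    Returns the new layout and the number of files that had to move.
    """
    # Size of each subzone = number of files that belong to it.
    sizes = [0] * zone_count
    for f in files:
        sizes[zone_of(f)] += 1
    # slot_zone[i] tells which subzone owns slot i of the new layout.
    slot_zone = [z for z, size in enumerate(sizes) for _ in range(size)]

    # Collect the misplaced files per target subzone, in order of appearance.
    incoming = [deque() for _ in range(zone_count)]
    for i, f in enumerate(files):
        if zone_of(f) != slot_zone[i]:
            incoming[zone_of(f)].append(f)

    # Refill every vacated slot with a file that belongs to that subzone.
    layout = list(files)
    moved = 0
    for i, f in enumerate(files):
        if zone_of(f) != slot_zone[i]:
            layout[i] = incoming[slot_zone[i]].popleft()
            moved += 1
    return layout, moved


def zone_of_letter(name):
    """Two subzones: A-D is subzone 0, E-H is subzone 1."""
    return 0 if name <= "D" else 1


original = list("ACGEHFDB")
layout, moved = partial_sort(original, zone_of_letter, 2)
strict_moves = sum(a != b for a, b in zip(original, sorted(original)))
print("".join(layout), moved, strict_moves)  # prints: ACDBHFGE 4 6
```

With two subzones only 4 of the 8 files move, against 6 for the strict sort, and the gap grows quickly on a real disk where most files are already in the right subzone.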
Quote from: tOM Trottier on June 29, 2008, 03:59:25 am
But does it matter if file C follows B follows A on any characteristic of that file? No.
Exactly the idea I am trying to express.
Quote from: tOM Trottier on June 29, 2008, 03:59:25 am
So on a global basis, you want to stuff the seldom accessed stuff near each end of the disk, especially the slow end, away from the middle, ...... The most heavily used storage should be the most "middle", and you access only small parts, at the very middle. These are directory entries, indexed database files, etc.
It depends very much on the partition's role. For a system partition it is valid.
Quote from: tOM Trottier on June 29, 2008, 03:59:25 am
But the middle is not exactly the physical middle. If you put all the seldom referenced stuff near the slow end, you can consider that not even part of the disk - so the logical middle slides towards the fast end.
Known fact. The logical middle of my data, counting the whole disk, is at circa 42% of the physical tracks,
and the physical middle is at 58% of the logical contents.
Quote from: tOM Trottier on June 29, 2008, 03:59:25 am
And the fast end you fill with big sequential files like EXEs.
If it is a system partition and the files are accessed often.
« Last Edit: June 29, 2008, 11:55:35 pm by poutnik »
Logged