_LargeFileCopy UDF

108 replies to this topic

#21 wraithdu

wraithdu

    this noise inside my head

  • MVPs
  • 2,413 posts

Posted 09 March 2011 - 05:34 PM

Updated, see first post.

@llewxam
I noticed this recently as well, but for copying files over a network. With a lot of small files, under 60K in my case, this function is about 400% faster than FileCopy over a sample of 25 files. However anything over 10MB and FileCopy wins. I tried sample 5MB and 10MB files and both were about the same. On a 25MB file though FileCopy kicked my ass at ~35 sec vs ~75 sec. I think it has to do with disk caching.







#22 llewxam

llewxam

    Universalist

  • Active Members
  • 414 posts

Posted 10 March 2011 - 01:46 AM

EEEEEEEEEEEEE, sorry, but WTF kinda disk are you using, a 20GB 4200RPM IDE Seagate?? :) LOL, sorry for the insult, I know I'm spoiled but DANG, that's under 1MB/s man! My last comp had 2* WD Raptors in RAID-0, new rig has a 120GB OCZ RevoDrive SSD which kills the Raptors! :) Oh, and gigabit LAN also helps to keep me spoiled!!!

Anyway, back to your UDF, I just transferred a 264MB AVI file from one location of my HDD to another, your routine took .34 seconds, FileCopy took .28 seconds. Sending the same file over the LAN to my NAS took your routine 23.52 seconds, FileCopy 24.42 seconds. That means that (maybe just due to the kind of hardware I am running) your routine actually was much closer to FileCopy speeds when sending over the network, 94% actually, vs 82% locally. And at work when I use my Sync Tool with _LargeFileCopy the speed seems to be great. I wonder if the performance is better than you think it is...... Have you tested it in any other environments?

Ian

My projects:

Spoiler

Find something helpful? BTC donations are appreciated!
16aheZKum8J32XPzk8dFj7AzX33An6zJ6d

#23 wraithdu

Posted 10 March 2011 - 04:36 AM

I tested at my work copying to servers we use for a particular internal solution. Yeah, LAN transfer speeds SUCK at work, I have no idea why. They are literally that slow even when dragging to copy in Explorer. Yet over the same network we have an intra-server deployment application that will push files around at 10 Mbps.... I've given up trying to understand.

#24 wraithdu

Posted 11 March 2011 - 05:33 PM

So I've done a lot of speed testing here at work to find the most efficient way to transfer a lot of small files (<60K). My test case transfers 25 JPGs of varying size, all <60K, in a manner where I can track individual copy progress. This is done so I can display a proper progress dialog during the course of the copy operation. This means not using something like DirCopy.

Copying using my function in serial takes ~30 seconds, while using FileCopy takes ~120 seconds.
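For anyone who wants to reproduce this kind of comparison, a minimal timing sketch could look like the one below. It only times the built-in FileCopy, both paths are placeholders, and it is not the test harness used for the numbers above.

```autoit
; Timing sketch: both paths are placeholders, adjust them to your own test files.
Local $sSource = "C:\test\sample.jpg"
Local $sDest = "\\server\share\sample.jpg"

Local $hTimer = TimerInit()
Local $iOk = FileCopy($sSource, $sDest, 1) ; 1 = overwrite an existing file
Local $fSeconds = TimerDiff($hTimer) / 1000 ; TimerDiff returns milliseconds

If $iOk Then
    Local $fMB = FileGetSize($sSource) / (1024 * 1024)
    ConsoleWrite(StringFormat("%.2f MB in %.2f s", $fMB, $fSeconds) & @CRLF)
Else
    ConsoleWrite("Copy failed" & @CRLF)
EndIf
```

Running the same loop over all 25 files and summing the times would give totals comparable to the figures quoted here.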

So next I wanted to try some external utilities to see what I could accomplish. First up was xcopy both in normal mode and with the /J switch (unbuffered I/O). Normal mode took ~140 seconds, while unbuffered took ~90 seconds. Next was eseutil, which is a utility that comes with Exchange Server to do database maintenance. Another of its functions, although somewhat odd, is an unbuffered file copy. I've seen this tool recommended around the interwebs for network file transfers, and it took ~30 seconds. Last, a small EXE I created that uses my function took about the same ~30 seconds.

My next idea was a parallel copy function. This would require using the same above external utilities and managing process slots. I chose to run 8 copies in parallel (incidentally the default for RoboCopy in multi-threaded mode). The results were a little surprising. xcopy in normal mode took ~95 seconds and in unbuffered mode took ~45 seconds. eseutil and my compiled EXE both won the day at about ~15 seconds.
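The "process slots" idea can be sketched roughly as follows. This is an illustration only: copytool.exe and its arguments are placeholders for whichever external copy utility is used, and the helper names (_WaitForFreeSlot, _StartCopy, _WaitForAll) are made up for this example.

```autoit
; Sketch of running up to 8 external copy processes at once.
; "copytool.exe" is a placeholder, not a real utility.
Global Const $MAX_SLOTS = 8
Global $aPids[$MAX_SLOTS]
For $i = 0 To $MAX_SLOTS - 1
    $aPids[$i] = 0 ; 0 marks a free slot
Next

; Block until a slot is free and return its index
Func _WaitForFreeSlot()
    While 1
        For $i = 0 To $MAX_SLOTS - 1
            ; ProcessExists returns 0 once the process has exited
            If $aPids[$i] = 0 Or Not ProcessExists($aPids[$i]) Then Return $i
        Next
        Sleep(50)
    WEnd
EndFunc   ;==>_WaitForFreeSlot

; Launch one copy in the next free slot; Run returns the PID, or 0 on failure
Func _StartCopy($sSource, $sDest)
    Local $iSlot = _WaitForFreeSlot()
    $aPids[$iSlot] = Run('copytool.exe "' & $sSource & '" "' & $sDest & '"', "", @SW_HIDE)
EndFunc   ;==>_StartCopy

; Wait for every copy that is still running
Func _WaitForAll()
    For $i = 0 To $MAX_SLOTS - 1
        If $aPids[$i] <> 0 Then ProcessWaitClose($aPids[$i])
    Next
EndFunc   ;==>_WaitForAll
```

Usage would be a loop that calls _StartCopy for each file, followed by one call to _WaitForAll. Per-file progress would have to be tracked separately, for example by counting finished slots.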

Considering that eseutil requires a 2MB (?!?!) DLL, I've decided to go with my pre-compiled EXE using the _LargeFileCopy function.

#25 wraithdu

Posted 15 April 2011 - 07:14 PM

Updated, see first post.

#26 llewxam

Posted 11 November 2011 - 03:16 PM

I love _LargeFileCopy sooo much, and use it in many of my scripts, but there is one issue that I would love to see a fix for - "Source file name exceeded 255 characters". Is there anything you can do about that? I of course can check for failures and use internal commands but.............

Thanks
Ian

#27 wraithdu

Posted 11 November 2011 - 05:57 PM

Where are you getting that error from? It's not part of the UDF.

This may be an API limitation, but you should be able to get around it by using UNC path names. Give me a bit more info and I can try to reproduce it.

#28 llewxam

Posted 11 November 2011 - 06:59 PM

LOL, sorry, I forgot that I had to manually create that error message. :D But sometimes I would get failures on the copy and added this as a check:
If StringLen($CopySource) > 255 Then $ErrorMessage = "**FAIL** " & Chr(34) & "Source file name exceeded 255 characters: " & Chr(34) & $CopySource & Chr(34) & " -> " & Chr(34) & $CopyDestination & Chr(34)


Soooo, sorry for the lapse of brain function there, but the problem remains that I have issues with the UDF if the pathname is >255 characters. You say to use UNC, I will look into that and see what it is and if it is applicable. Also, I get this error over the network, I was not getting it via HDD though. And I also remember that you updated your UDF and mentioned the UNC thing, but I did not update my code to include your newer version.

Looks like I need to look in to some things :oops:

Thanks!
Ian

#29 wraithdu

Posted 11 November 2011 - 07:56 PM

See here:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

You should be able to use UNC naming conventions for extra long paths >255 characters, i.e. \\?\C:\some\really\long\path\to\file.ext
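As a rough illustration of the prefixing described on that MSDN page, here is a small helper. _ToLongPath is a hypothetical name and is not part of the UDF. It assumes the input is already a fully qualified path, since the \\?\ prefix turns off path normalization.

```autoit
; Hypothetical helper: _ToLongPath is not part of _LargeFileCopy or AutoIt.
Func _ToLongPath($sPath)
    ; Already prefixed ("\\?\..." also covers "\\?\UNC\..."): leave untouched
    If StringLeft($sPath, 4) == "\\?\" Then Return $sPath
    ; Network path "\\server\share\..." becomes "\\?\UNC\server\share\..."
    If StringLeft($sPath, 2) == "\\" Then Return "\\?\UNC\" & StringTrimLeft($sPath, 2)
    ; Local path "C:\dir\file" becomes "\\?\C:\dir\file"
    Return "\\?\" & $sPath
EndFunc   ;==>_ToLongPath

ConsoleWrite(_ToLongPath("C:\some\dir\file.ext") & @CRLF)    ; \\?\C:\some\dir\file.ext
ConsoleWrite(_ToLongPath("\\server\share\file.ext") & @CRLF) ; \\?\UNC\server\share\file.ext
```

The first check keeps the helper safe to call twice on the same path, which avoids stacking prefixes.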

#30 llewxam

Posted 11 November 2011 - 08:20 PM

I got some odd behavior from this, the root destination folder got a "011" prefix by inserting ""\\?\" &" to it, and "\\?\UNC\" bonked altogether. BUT, I thank you for pointing me in this direction, and I think that between that suggestion and updating my UDF to your most recent version I should have it working.

Thanks again!
Ian

#31 wraithdu

Posted 11 November 2011 - 08:30 PM

011 prefix? That is really odd. Is that coming from my UDF somewhere, or from your script perhaps?

Edit:
This quick test worked ok for me:
#include <_LargeFileCopy.au3>
$r = _LargeFileCopy("\\?\" & @ScriptFullPath, "\\?\" & @ScriptDir & "\folder\copy.au3", 11)
ConsoleWrite($r & " : " & @error & @CRLF)

Edited by wraithdu, 11 November 2011 - 08:34 PM.


#32 llewxam

Posted 12 November 2011 - 12:49 AM

I have no idea yet, my only GUESS (absolutely no "proof") is that it came from using an old version of your UDF. Like I said earlier, I did not update along with you, I just stuck to what was working because I made some super-small modifications along the way to tally file counts and sizes and just never bothered updating as you posted changes. I just got home from work and sat down, so I plan on updating my Sync tool to use your latest version and re-test.

BTW, what are your thoughts on using "\\?\" & $Path vs "\\?\UNC\" & $Path? Long UNC can support 32k+ path lengths! :D

Thanks
Ian

#33 llewxam

Posted 12 November 2011 - 02:15 AM

I did some playing and found that I had to use Long UNC on network paths and Short UNC on local paths:

If StringLeft($FileList[$a], 2) == "\\" Then
      $CopySource = "\\?\UNC\" & StringTrimLeft($FileList[$a], 2)
Else
      $CopySource = "\\?\" & $FileList[$a]
EndIf


But combined with the latest version of your UDF that is working great. Now tomorrow comes the real test - THE REAL WORLD, and not my home machines, where paths over 260 don't really happen... :D

Thanks for the help, now time to see if _FileListToArrayXT can be tweaked to handle UNC as well :rip:


EDIT
After a little toying I have my source paths adjusted to be UNC and then send that path off to _FileListToArrayXT, and add a UNC prefix to the destination path before sending that to _LargeFileCopy, all looks good :oops:


Ian

Edited by llewxam, 12 November 2011 - 03:06 AM.

#34 wraithdu

Posted 12 November 2011 - 11:33 PM

Nice :D Glad you got it working! Make sure you kept up with the changes I made to the UDF, there are some major script breaking changes in the function parameters.

#35 wraithdu

Posted 12 November 2011 - 11:41 PM

I'm also curious... you said you use this at work for network transfers. Have you had any trouble coming from using large write chunk sizes? What size did you end up using?

I ask cause I use this UDF for a utility at work as well. It worked well while at work on the LAN using the default 8 MB chunk size (it copies a lot of small jpgs <50K, so essentially in one write operation). However when working from home on the company VPN it failed quite regularly (SHITTY VPN). I ended up using a 4K chunk size and up to 3 retries to be successful.
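The retry part of that workaround could be sketched like this. FileCopy stands in for the real copy call here, _CopyWithRetry is a hypothetical name, and the 500 ms pause is an arbitrary choice. The chunk size setting belongs to the UDF and is not shown.

```autoit
; Sketch only: FileCopy is a stand-in for the actual copy call,
; and _CopyWithRetry is a hypothetical helper name.
Func _CopyWithRetry($sSource, $sDest, $iMaxRetries = 3)
    ; One initial attempt plus up to $iMaxRetries retries
    For $iAttempt = 0 To $iMaxRetries
        ; Flag 1 = overwrite an existing file; FileCopy returns 1 on success
        If FileCopy($sSource, $sDest, 1) Then Return SetError(0, $iAttempt, 1)
        ; Brief pause before the next try on a flaky link
        If $iAttempt < $iMaxRetries Then Sleep(500)
    Next
    ; All attempts failed: @error = 1, return value 0
    Return SetError(1, $iMaxRetries, 0)
EndFunc   ;==>_CopyWithRetry
```

On success, @extended holds the number of retries that were needed, which is handy for logging how unreliable the connection was.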

#36 llewxam

Posted 13 November 2011 - 12:05 AM

Dang, bad VPN is right, your router might need some beefing up, but I am not our company's network engineer so I don't really have anything helpful for you there..... Unless, you just leave your utility on your work machine and run it from there?? Have it pull your data, rather than you trying to push it in, that way it can run at whatever speeds it is capable of? hmmm...

I have stuck to your default 8MB buffer, long long ago I was testing things from 4KB up to 32MB or so, and honestly I almost never noticed any difference at all except for the absurdly small settings so have left it alone. I never have had a problem with the transfer that is attributable to your UDF.

Last thing, I found another odd prefix getting in there today when I stopped at work to test my new Sync build, and had to really re-do my UNC implementation. I think I have it down now, but I had to skip applying it to _FileListToArrayXT which is fine since that was never really the problem anyway.

Take care
Ian

#37 wraithdu

Posted 13 November 2011 - 06:32 AM

My home connection is fine, and internet speed tests through the VPN max out my connection. But communication with internal work machines is bad. I think it's a latency/connection issue rather than a speed issue. Either way it sucks and there's little I can do about it.

Regarding the prefixes... still very odd. Are you using regular expressions or StringFormat at all, maybe getting something weird with octal codes in there? 011 is octal for TAB...

#38 llewxam

Posted 13 November 2011 - 03:30 PM

Nah, here is a snippet of what I am doing:

If StringLeft($Source, 2) == "\\" Then
    $SourceUNCPrefix = "\\?\UNC"
    _ArrayTrim($FileList, 1, 0, 1)
    $SourcePathLength -= 1
Else
    $SourceUNCPrefix = "\\?\"
EndIf
If StringLeft($Target, 2) == "\\" Then
    $TargetUNCPrefix = "\\?\UNC"
    $Target = StringTrimLeft($Target, 1)
Else
    $TargetUNCPrefix = "\\?\"
EndIf
_Copy($SourceUNCPrefix & $CopySource, $TargetUNCPrefix & $Destination, $CopySize)


_Copy just leads to another routine that checks to see if the file needs to be synced, and if True then it sends the paths off to _LargeFileCopy, so no translation at all happens there. I now have an odder problem though, in that when the Target is a network path I get an error "Verify Failed", which is @error 6... but I have removed your whole verify section! There isn't even a SetError(6,0,0) present, AND it works fine copying anything to a local drive. EH??? hehe

Ian

#39 wraithdu

Posted 13 November 2011 - 04:32 PM

Do you have another version of the UDF hanging around in the local directory somewhere, or a main include path? I return error 6 from two places at the end of the function, maybe you missed one?

Edit:
I also had a thought about the weird prefix. I use some regular expressions in the _LFC_CheckDestination function, which is where the destination directory is created if it does not exist with flag 2. Maybe the old ones had problems with UNC paths.

Edited by wraithdu, 13 November 2011 - 04:38 PM.


#40 llewxam

Posted 13 November 2011 - 06:40 PM

I don't even call _LargeFileCopy from an include, I put it directly in since I had to add a few things to it for my _SpeedCalc function to work properly. But just in case, I deleted the AU3 from my ScriptDir, commented out _LFC_CheckDestination, and stripped everything out of _LFC_CreateFile except the DllCall itself, and the same thing continues to happen:

**FAIL** "Verify failed: "?UNCOLE-DRIPPYstuffCodeAutoIt Machine Code Algorithm CollectionHashECHO.au3" -> "?UNCOLE-DRIPPYwritableHashECHO.au3"


I just found that when I send the path including the UNC prefix to _LargeFileCopy the prefix IS being removed; when I apply it just prior to _LFC_CreateFile the script is able to copy the files successfully but I get the "Verify Failed" message, and if there is an extra directory in the path the error is "Failed to create destination file". I may experiment with just using _WinAPI_CreateFile to see what happens, but another confusing aspect is that I have code in place to create new folders which failed as well! hmmmm, adding UNC is causing a lot more trouble than I expected!!!!

Ian
