Jump to content
Sign in to follow this  
spac3m0nk3y

Splitting a CSV file every 65535 lines

Recommended Posts

spac3m0nk3y

I need to read in a CSV file and split it up into a new file every 65536 lines.  The reason is because we are using an older version of Excel that can only view 65536 lines at a time.  Please also note that we do NOT have Excel installed on the PC where the script will run.

Here is what I have so far:

#include <file.au3>

$sFilePath = @ScriptDir & "\testFile.csv"
If NOT FileRead($sFilePath) Then
    ConsoleWrite("DEBUG: Error: " & @error & @CRLF)
    Exit
Else
    ConsoleWrite("DEBUG: Lines: " & _FileCountLines($sFilePath) & @LF)
EndIf

$i = 65536
$hFileRead = FileRead($sFilePath)
$i_pos = StringInStr($hFileRead, @CRLF, 0, $i)
FileWrite(@ScriptDir & "\first.csv", StringLeft($hFileRead, $i_pos - 1))
FileWrite(@ScriptDir & "\second.csv", StringMid($hFileRead, $i_pos + 2))

I pieced this together from a couple other threads.  I can split up the file into two files, but I need to keep going, so that if "testFile.csv" has 150,000 lines, it gets split up into 3 files, or if it has 200,000 lines, it gets split up into 4 file, etc.  I'm not sure how to incorporate this into my script.  I was thinking about a While loop, but I'm not sure how to do it.

Share this post


Link to post
Share on other sites
Jos

Untested but should be close:

$sFilePath = @ScriptDir & "\testFile.csv"
$hFR = Fileopen($sFilePath)

$of = 1
$i = 65536

while 1
        $fho = FileOpen(@ScriptDir & "\output" & $of & ".csv",2)
        for $x = 1 to $i
            $irecord = FileRead($hFR)
            if @error then ExitLoop 2
            FileWrite($fho,$irecord)
        Next
        FileClose($fho)
        $of += 1
WEnd

Jos :)


Visit the SciTE4AutoIt3 Download page for the latest versions  - Beta files                                How to post scriptsource        Forum Rules
 
Live for the present,
Dream of the future,
Learn from the past.
  :)

Share this post


Link to post
Share on other sites
spac3m0nk3y

Untested but should be close:

$sFilePath = @ScriptDir & "\testFile.csv"
$hFR = Fileopen($sFilePath)

$of = 1
$i = 65536

while 1
        $fho = FileOpen(@ScriptDir & "\output" & $of & ".csv",2)
        for $x = 1 to $i
            $irecord = FileRead($hFR)
            if @error then ExitLoop 2
            FileWrite($fho,$irecord)
        Next
        FileClose($fho)
        $of += 1
WEnd

Jos :)

Running that just puts everything in output1.csv and keeps creating output#.csv files.

Share this post


Link to post
Share on other sites
spac3m0nk3y

Did you try it and was that the outcome?

Jos

Tried it, it creates output1.csv with the contents as testFile.csv and then keeps creating empty output#.csv files until I kill the script.

Share this post


Link to post
Share on other sites
UEZ

Here another version:

#include <File.au3>

$sFilePath = @ScriptDir & "\test.csv"
SplitCSVtoNewFile($sFilePath, 50)

Func SplitCSVtoNewFile($sFilePath, $iLine)
    Local $aCSV_Lines = StringSplit(StringStripCR(FileRead($sFilePath)), @LF, $STR_CHRSPLIT)
    If @error Then Return SetError(1, 0, 0)
    Local $i = 1, $c = 1, $sLine, $hFile, $d = StringLen(UBound($aCSV_Lines))
    Do
        $sLine &= $aCSV_Lines[$i] & @CRLF
        If Not Mod($i, $iLine) Then
            $hFile = FileOpen(StringTrimRight($sFilePath, 4) & StringFormat("%0" & $d & "i", $c) & ".csv", $FO_OVERWRITE)
            FileWrite($hFile, StringTrimRight($sLine, 2))
            FileClose($hFile)
            $sLine = ""
            $c += 1
        EndIf
        $i += 1
    Until $i = $aCSV_Lines[0]
    If $sLine <> "" Then
        $hFile = FileOpen(StringTrimRight($sFilePath, 4) & StringFormat("%0" & $d & "i", $c) & ".csv", $FO_OVERWRITE)
        FileWrite($hFile, StringTrimRight($sLine, 2))
        FileClose($hFile)
    EndIf
EndFunc

 

Edited by UEZ

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ

Share this post


Link to post
Share on other sites
spac3m0nk3y

ah ...i see it... this line should be:

$irecord = FileReadLine($hFR)

 

I had to edit

FileWrite($fho,$irecord)

to:

FileWriteLine($fho,$irecord)

Otherwise, everything is written on one line.

Do you know what I would add to get the column headers from the first file into the start of each subsequent one as the first line?

Share this post


Link to post
Share on other sites
Jos

Yea, sorry about the FilexxxLine() mess I made. That happens when you code without testing. :)

Just read the first line and save it in a variable and write that as first line after the FileOpen statement within the loop?

Jos


Visit the SciTE4AutoIt3 Download page for the latest versions  - Beta files                                How to post scriptsource        Forum Rules
 
Live for the present,
Dream of the future,
Learn from the past.
  :)

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×