Importing Text Files of 100,000+ lines, many blanks

P

Pflugs

I have been exporting model trees as text files from our modeling program
(running on Unix), and then FTP'ing them to my Windows workstation. Because
of the way the program outputs the files, I use a question mark ("?") as a
delimiter. Many times, commas, dashes, spaces, and slashes are used in the
data's columns, so the common delimiters are out.

Sometimes the files contain over 100,000 lines, so I tried using the code
Microsoft suggests to import the lines to multiple sheets. However, I think
the code read the entire file as one line, for it tried to place the entire
file into "A1." Sure enough, when I opened the text file with Notepad, it's
all one line. But, if I open the file with Wordpad, the lines are correctly
displayed. If I save the file and reopen with Notepad, the lines are
correctly displayed. How do I workaround this?

(By the way, I did go in and verify that there were "Chr(10)'s" where
Notepad displayed boxes.)

Also, most of the lines in the text file are blank or contain a handful of
spaces and then a question mark. Is there a way that I can force the code to
only import lines that contain actual data? How fast is this compared to
opening the whole file (assuming it will fit) and running a for loop to
delete the blank lines?

Thanks for your help,
Pflugs
 
J

JLatham

In a 'regular' ASCII text file output by Notepad you will find that the end
of line sequence is 2 bytes, decimal value 13 followed by decimal value 10.
If the 13 does not preceed the 10, then it's not recognized as a new line.
That is CR followed by LF.

I'm thinking if you can define how to terminate each line on the Unix side
you should be able to output so that Excel can input with the code you have.

There's a good write up about it all over in Wikopedia at
http://en.wikipedia.org/wiki/Newline
Even gives some programming 'work-arounds'.
 
H

Hemant_india

hi jlatham
if i'm not wrong
with "open statement" or opena as textstream statement
u can populate a temparrory array for each line
then using split() u can delimit the data as u want which futher can e placed
in excel sheet.
it seems roundabout but it worked with me fine
i actually collected around 20 filess' data in an array and did my
manipulation
 
P

Pflugs

Thanks for your help. I have sent an eaxample of the text file to your email
address. Please note that due to the data's potential sensitivity, I changed
many of the letters and numbers. That should not affect the overall problem.

Thanks again!
Pflugs
 
P

Pflugs

I see. Now, I looked through the VB help files at "Line Input #" and it said
that the command reads one character at a time until it encounters a Chr(13)
or a Chr(13) + Chr(10). I recorded a macro of manually copying the character
from my text file I thought was the problem, and sure enough, it was Chr(10).


Now, I think that most of the lines are delimited by only one Chr(10). I
tested this by opening the file in Wordpad, saving it, opening it in Notepad,
and verifying that the lines were entered correctly (which they were). So,
if I can get Excel to open the file first in Wordpad, save it through
Wordpad, and then use the "Line Input #" command, things work. Is there a
way to program this? Is there a better way that going through Wordpad? I'd
like to keep everything inside Excel as much as possible.

Thanks,
pflugs
 
H

Hemant_india

try this code
changeing as per ur needs
Dim arsec()
Dim txtarray()
arr = 1
ReDim txtarray(arr)
ReDim arsec(7)
arsec(1) = "hsln.sec"
arsec(2) = "hypln.sec"
arsec(3) = "omt.sec"
arsec(4) = "plln.sec"
arsec(5) = "glln.sec"
arsec(6) = "fdln.sec"
arsec(7) = "nscln.sec"
Const ForReading = 1, ForWriting = 2, ForAppending = 3
Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0

Dim fs, f, ts, s

Set fs = CreateObject("Scripting.FileSystemObject")

For n = 1 To UBound(arsec)
mfile = "g:\devr\01\src\" & arsec(n) <=======change
Set f = fs.getfile(mfile)

Set ts = f.OpenAsTextStream(ForReading, TristateUseDefault)
Do While ts.AtEndOfLine <> True
s = ts.ReadLine
s = s & "|" & Left(arsec(n), 3) ' this may not be necessary for u
Debug.Print ts.Line
txtarray(arr) = s
Debug.Print txtarray(arr)
arr = arr + 1
ReDim Preserve txtarray(arr)
Debug.Print s

Loop
ts.Close
Next n
For h = 1 To UBound(txtarray)
Debug.Print txtarray(h)
Next h
'===================
Set ts = Nothing
fs.CreateTextFile "sec.txt" 'Create a file
Set f = fs.getfile("sec.txt")
Set ts = f.OpenAsTextStream(ForWriting, TristateUseDefault)
For n = 1 To UBound(txtarray)
Debug.Print txtarray(n)
s = txtarray(n)
ts.writeLine s & vbCr
Next n
ts.Close

End Sub
 
H

Hemant_india

i've created a txt file again in my code
u can palce the file in excel
also u can eliminate empty lines
if not len(array(whatever))<=0
code to put data in excel
end if
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top