importing seriously denormalized XLS files into A2002

P

pietlinden

Sorry if this is a stupid question, but here goes.

I'm working for a company that collects survey data in Excel. As a
result, very little is normalized. The basic structure of a standard
"survey" spreadsheet is like this:

<Demographic Data Columns><Survey Data Columns>

Sometimes the Survey data has multiple columns for what would normally
be a combobox in Access. For example, a question might have one of
several responses (Never, Sometimes, Always), which appears as separate
columns with an X in the chosen column.

I was thinking I might have to write something that at least handles
the demographic data import by creating a table of (XLColumnName,
CorrespondingDBColumn) pairs so that I could create a query on the fly
that would insert the data...

is this a totally insane way to go about this? Does anybody have any
suggestions? (I know, chuck the whole thing and use a Web UI... that's
what I'm pushing for, but this is data they already have.)

Thanks for any pointers - web sites, books, whatever you can suggest is
fine!

Pieter
 
M

Matthias Klaey

Sorry if this is a stupid question, but here goes.

I'm working for a company that collects survey data in Excel. As a
result, very little is normalized. The basic structure of a standard
"survey" spreadsheet is like this:

<Demographic Data Columns><Survey Data Columns>

Sometimes the Survey data has multiple columns for what would normally
be a combobox in Access. For example, a question might have one of
several responses (Never, Sometimes, Always), which appears as separate
columns with an X in the chosen column.

I was thinking I might have to write something that at least handles
the demographic data import by creating a table of (XLColumnName,
CorrespondingDBColumn) pairs so that I could create a query on the fly
that would insert the data...

is this a totally insane way to go about this? Does anybody have any
suggestions? (I know, chuck the whole thing and use a Web UI... that's
what I'm pushing for, but this is data they already have.)

Thanks for any pointers - web sites, books, whatever you can suggest is
fine!

Pieter

well .. I guess you are stuck with writing custom code that translates
the Excel data into something that more resembles a database. The only
alternative seems to be to convince the customer that s/he needs
another form of input for the surveys. A comparison of costs would
perhaps be helpful in this situation.

HTH - not much :-(
Matthias Kläy
 
J

John Vinson

Sorry if this is a stupid question, but here goes.

Not at all. It's a common (and knotty!) problem.
I'm working for a company that collects survey data in Excel. As a
result, very little is normalized. The basic structure of a standard
"survey" spreadsheet is like this:

<Demographic Data Columns><Survey Data Columns>

Sigh.... all too typical.
Sometimes the Survey data has multiple columns for what would normally
be a combobox in Access. For example, a question might have one of
several responses (Never, Sometimes, Always), which appears as separate
columns with an X in the chosen column.

I was thinking I might have to write something that at least handles
the demographic data import by creating a table of (XLColumnName,
CorrespondingDBColumn) pairs so that I could create a query on the fly
that would insert the data...

is this a totally insane way to go about this? Does anybody have any
suggestions? (I know, chuck the whole thing and use a Web UI... that's
what I'm pushing for, but this is data they already have.)

I'm not sure that this would really end up saving time, but it
probably would be worth some effort. You could actually get a start by
linking to the spreadsheet, and using VBA code to loop through the
linked spreadsheet's Fields collection to fill out the records in
XLColumnName.

The tricky part will be that you'll need to RESHAPE the data to
normalize it; it's not clear to me that there will be just a
CorrespondingDBColumn, there will almost surely be a
CorrespondingDBTable as well!

John W. Vinson[MVP]
 
J

John Nurick

Hi Pieter,

Occasionally I have to deal with survey data like that, but so far it's
always come as text files (with far more than the 255 columns allowed in
an Excel sheet). To help, I developed a Perl utility that takes an
arbitrarily wide text file and converts it into a tall narrow one from
which data can easily be imported into a normalised Access structure.

If this sounds helpful, look for "txtnrm.pl" at
http://www.j.nurick.dial.pipex.com/Code/Perl/index.htm
Obviously you'd need to save the Excel data to a text file first
(tab-delimited is convenient).
 
P

pietlinden

Matthias said:
well .. I guess you are stuck with writing custom code that translates
the Excel data into something that more resembles a database. The only
alternative seems to be to convince the customer that s/he needs
another form of input for the surveys. A comparison of costs would
perhaps be helpful in this situation.

HTH - not much :-(
Matthias Kläy

Thanks Matthias, that's what I was afraid of. I wish the data were
denormalized in predictable ways (delimiters inside fields, like
comma-separated), but no such luck. The good (ish) news is that the
company just bought an Access database solution to handle the future
surveys. so at least this kind of disaster won't continue. I guess
once I see enough spreadsheets, I'll have a reasonable idea of field
names to look for and I have code for that - I'll just create a table
of keywords to search for and flag. Then at least I'll know what has
to be done to the various files. Not perfect, but probably the best I
am going to do.
 
K

kenlyle

John Nurick is a rock star

That PERL normalizer reminds me of a Paradox script I wrote some 2
almost years ago....but better

There may be a minor issue in that the last field in each group seem
to break to a new line...for example..

"3",Importance: Biology,"4
"3",Effectiveness: Computer Skills Training,"2
"3",Importance: Computer Skills Training,"4
"3",Effectiveness: Reading,"4
"3",Importance: Reading,"5
"3",Effectiveness: Playgrounds,"2
"3",Importance: Playgrounds,"3
"3",Effectiveness: Sidewalks,"3
"3",Importance: Sidewalk
,"5
"3",Importance: Biology,"4
"3",Effectiveness: Computer Skills Training,"4
"3",Importance: Computer Skills Training,"5
"3",Effectiveness: Reading,"5
"3",Importance: Reading,"4
"3",Effectiveness: Playgrounds,"4
"3",Importance: Playgrounds,"5
"3",Effectiveness: Sidewalks,"4
"3",Importance: Sidewalk
,"4

Maybe it has something to do with the ":" in the data? Anyway, when
use the -1 flag, this wrapping doesn't happen, but the ": " i
replaced by "__", which is no biggie

NICE work

Thanks
Ke


-------------= Posted from Ugroups.com =------------
---= Fast & Free Web Portal to Usenet Newsgroups =--
-------------= http://www.ugroups.com/ =------------
 
J

John Nurick

Ken,

Thanks for the kind words.

Are you able to send me a sample of the data so I can work out where the
linebreaks are coming from? Just remove the bogus middle name from the
email address in the header of my message.

On Sat, 27 Jan 2007 10:52:36 -0600,
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top