When is a NULL Not a NULL ?

R

Robert Morley

Of course I read the thread, and I believe it's simply a matter of the
language we're using to express what is essentially the same concept.
Anybody who knows enough to debate the matter already KNOWS what a Null is;
it's simply a matter of the English language not quite having the words to
express it.

When I say "nothing" (or "unknown" or whatever), I don't mean "nothing" as
in a concrete value of "nothing" (empty string, string value of "nothing",
value of 0, etc.), but "nothing" in the sense of data not present. And as
near as I can tell, that's what we're ALL trying to say, we're just not
using the words that you think of them in.

In point of fact, I DO "know 'naught' from 'nothing'", and I suspect the
rest of the people debating the issue do as well. :) My question is, why
are we debating this at all? The OP's question was answered...who cares how
I or you or anybody else conceives of the concept of Null, just as long as
we know for ourselves what it is and how best to use it (or not, as the case
may be)!



Rob
 
C

Craig Alexander Morrison

This is a big statement and we often read something similar to this,
written in one way or another. However, there is absolutely no example,
proof or demonstration behind it to back it or to substantiate it. When
you want to write a statement about something, you must come with a least
a little more in order to support it.

You may wish to start by checking out

"The Final NULL In The Coffin"

http://www.dbdebunk.citymax.com/page/page/1396241.htm

....and the various references there.

--
Slainte

Craig Alexander Morrison
Crawbridge Data (Scotland) Limited
"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
wrote in message news:[email protected]...
 
S

Sylvain Lafontaine

« It has been demonstrated ad nauseum in database literature. »

Yeah, I know it has been and it's still be demonstrated and discussed as
nauseum in database literature and that quite probably in the foreseable
future that they will keep doing so. I also know that it's very easy to
*prove* that any set of values associated with the null value can be
replaced with the association of this same set of values less the null value
and a flag value.

So what? Are these people who are living in their ivory tower also doing my
work and paying my bills?

People are not so stupid and tend to keep with them things that they find
useful while dropping other things that may be considered by others as
beeing theoritically better but a more of an hindrance than anything else
when you have to use them in the real world.

All this theory might look good on paper but when it comes to design your
interface with the users, that's another story. I'm more concerned with the
later than with the former.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: http://cerbermail.com/?QugbLEWINF


Sylvain said:
« "All in all Nulls are a pain and SQL and the products only make it
worse.
I would always *try* to avoid them, Nulls that is (g), with careful
database
design. The way they are implemented from product to product and from
version to version could easily vary. This, markers idea, was a mistake by
Codd, IMHO, and the vendors and ANSI jumped on it and we seem to be stuck
with it for now." »

This is a big statement and we often read something similar to this,
written
in one way or another. However, there is absolutely no example, proof or
demonstration behind it to back it or to substantiate it. When you want
to
write a statement about something, you must come with a least a little
more
in order to support it.

It has been demonstrated ad nauseum in database literature. Date,
Darwen and Pascal for example. The proof is the principles violated and
the logic that cannot be reconciled with nulls, not just isolated
examples.
I could give you a lot of examples where the use of NULL leads to a
clearer
code, with less fuss and easier to read. However, I won't because this is
a
case where personal experience is more important than dogmatic statements.
NULL is like any other tools, you must learn how to use it properly and if
you want don't know about it or don't want to use it, that's your problem,
not the other's.

In my case, I like to have more tools in my toolbox, not less: when the
only
tool you have in your toolbox is an hammer, everything else looks like a
nail and inversely, when all you have to do is to hit a nail, every tool
begins to look like a hammer.

The toolbox analogy implies that you have a choice. Many SQL
implementations make it unreasonably hard to avoid nulls, but that's a
deficiency of SQL, not an advantage of nulls. Not everyone gets to
choose what data model and DBMS to work with.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
C

Craig Alexander Morrison

One word: GUI.

I think you'll find that it is in fact a TLA.

And what are you on about anyway?

--
Slainte

Craig Alexander Morrison
Crawbridge Data (Scotland) Limited
"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
wrote in message news:[email protected]...
 
C

Craig Alexander Morrison

I do so hope you are not saying that the GUI affects the underlying database
structure.

Do you ever feel the need to denormalise a database for the sake of the user
interface?
 
D

David Portas

Sylvain said:
All this theory might look good on paper but when it comes to design your
interface with the users, that's another story. I'm more concerned with the
later than with the former.

No problem then! User interface design has nothing to do with the
logical data model.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
S

Sylvain Lafontaine

« Do you ever feel the need to denormalise a database for the sake of the
user interface? »

Yeah, I do when there is a speed problem. A database is not some god living
in its own universe but a piece of software to be used by people not only
for them to enter data at one end but also for them to read data at the
other end.

All these discussions about replacing null values with default values - such
as 0 - and a flag fall under their own weight when you have to interface
with the user in one direction or the other. You can argue ad nauseum that
translating a numerical value into a string representation and vice versa
should not be taken into consideration when designing a database or that any
data pump can make the difference when 0 is a 0 and when 0 is not a 0
because of some other flag; the fact is the database will be used by people
and for people.

Without people, there would be no database, so discussing the theory about
saying if it's better to have null value inside a numerical collection or to
have a 0 which on some occasions will be a 0 but on some other occasions
will not be a 0 because you have chosen to give it two meaning is pointless
to me.

BTW, even theoritically, I really don't see why giving a particular value
inside a collection two different meanings because it will also be used as a
magical value should be considered better then inserting the Null value into
this collection. In both cases, you must take special measures when
manipulating them and using the null value instead of a magical value has
proven to be the more simpler solution in most of the cases.
 
D

David Portas

Sylvain said:
« Do you ever feel the need to denormalise a database for the sake of the
user interface? »

Yeah, I do when there is a speed problem. A database is not some god living
in its own universe but a piece of software to be used by people not only
for them to enter data at one end but also for them to read data at the
other end.

Denormalization has nothing to do with performance. I can only
encourage you to read the authors already mentioned to understand what
normalization and Physical Data Independence mean.

What you say may have relevance to some development environment such as
Access. It has no relevance at all to an RDBMS. The limitations of
Access are not particularly interesting in this context and certainly
won't move us forward in any discussion about nulls. What I am saying
is that the way you design a user interface should not be constrained
by the way you design the database and vice versa.

BTW, even theoritically, I really don't see why giving a particular value
inside a collection two different meanings because it will also be used as a
magical value should be considered better then inserting the Null value into
this collection. In both cases, you must take special measures when
manipulating them and using the null value instead of a magical value has
proven to be the more simpler solution in most of the cases.

Because NULL is not a value and does not behave like one. It is an
obstruction to deriving correct results from the database.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
R

Robert Morley

Well, let's start with the statement at the bottom: "This paper should be
considered investigative in character. Further research is required at both
the logical and implementation levels, but we believe that the idea is sound
and implementable."

While it may work well in theory, and may even work well in reality, we
still have to work with the tools we have, not the tools that may or may not
be forthcoming.



Rob
 
R

Robert Morley

Denormalization has nothing to do with performance.

Oh, but it does! I've seen countless instances where denormalizing your
data vastly outperforms your normalized data. In theory, it shouldn't, but
in practice, it sometimes does.
It has no relevance at all to an RDBMS.

You can't state that categorically. Not all RDBMS's have the same
characteristics, so limitations in Access MAY also be present in SQL or may
not, or maybe SQL does better than Oracle in some things, etc.
that the way you design a user interface should not be constrained
by the way you design the database and vice versa.

Here I have to agree with Sylvain...theory and practice are sometimes two
different things. My databases are generally perfectly normalized,
following almost all the rules of normalization (except those which conflict
with each other)...sometimes, however, that leads to very clunky user
interfaces, or so much additional programming in the UI as to not be worth
it. It's not something I generally plan on, but I accept that once in a
blue moon, the underlying structure will be affected by the UI.
Because NULL is not a value and does not behave like one. It is an
obstruction to deriving correct results from the database.

As a general rule, you don't WANT it to behave like a value. After all,
what is a NULL value but an internal value & flag, just like it would be if
implemented externally...except that there's one disadvantage: you have to
do all the work yourself rather than relying on the built-in and highly
optimized routines that already exist in SQL Server itself.



Rob
 
S

Sylvain Lafontaine

Well, if denormalization has nothing to do with performance, then maybe you
should correct the following wikipedia article:
http://en.wikipedia.org/wiki/Denormalization

Second, I have yet to see a database of bills where the total price of all
items for a command before taxes, the taxes rates, the amount of taxes to
pay, any special discount associated with a coupon and the final total price
are not entered directly in the database because this will *denormalize* the
database.

And I'm not talking here only about Access databases but also of
multi-millions dollars database systems on which I have worked in the past
as well.

Physical Data Independence is about the total absence of redundancy; I will
be very surprised if you have absolutely no database with any kind of
redundancy inside but I cannot speak for you on this point.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: http://cerbermail.com/?QugbLEWINF


Sylvain said:
« Do you ever feel the need to denormalise a database for the sake of the
user interface? »

Yeah, I do when there is a speed problem. A database is not some god
living
in its own universe but a piece of software to be used by people not only
for them to enter data at one end but also for them to read data at the
other end.

Denormalization has nothing to do with performance. I can only
encourage you to read the authors already mentioned to understand what
normalization and Physical Data Independence mean.

What you say may have relevance to some development environment such as
Access. It has no relevance at all to an RDBMS. The limitations of
Access are not particularly interesting in this context and certainly
won't move us forward in any discussion about nulls. What I am saying
is that the way you design a user interface should not be constrained
by the way you design the database and vice versa.

BTW, even theoritically, I really don't see why giving a particular value
inside a collection two different meanings because it will also be used as
a
magical value should be considered better then inserting the Null value
into
this collection. In both cases, you must take special measures when
manipulating them and using the null value instead of a magical value has
proven to be the more simpler solution in most of the cases.

Because NULL is not a value and does not behave like one. It is an
obstruction to deriving correct results from the database.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
D

David Portas

Robert said:
different things. My databases are generally perfectly normalized,
following almost all the rules of normalization (except those which conflict
with each other)

Now you have intrigued me! How can the rules of normalization conflict
with each other? I think you owe us an example please!

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
D

David Portas

Sylvain said:
Well, if denormalization has nothing to do with performance, then maybe you
should correct the following wikipedia article:
http://en.wikipedia.org/wiki/Denormalization


I don't trust anything I read on Wikipedia so I've no particular
interest in correcting it. Normalization (and therefore
Denormalization) operate on the logical model, whereas performance is
determined entirely by the physical implementation. "Denormalization
for performance" therefore doesn't make sense.

What you mean is that in *your* DBMS or *my* DBMS logical design
sometimes has to be compromised because of some physical design
limitations. That is not and shouldn't be a general rule however. I
have been trying to keep my points non-product specific.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
R

Robert Morley

What you mean is that in *your* DBMS or *my* DBMS logical design
sometimes has to be compromised because of some physical design
limitations. That is not and shouldn't be a general rule however. I
have been trying to keep my points non-product specific.

By definition, joins are expensive to perform, regardless of product. For
data modification (insert/update/delete), it's best to have a fully
normalized database; for querying, however, it is often better to have
denormalized data available (in addition to the normalized data) in separate
functionally-read-only tables which are updated intermittently, as often as
required. Why bother fetching something from a different table when you can
have it readily available in the same table? This is the whole concept
behind data warehousing.

In small- to mid-sized apps, this isn't usually a concern, and you're best
off sticking with fully normalized data. In large apps, however, with data
in the hundreds of thousands, or even hundreds of millions of rows, data
warehousing becomes a significant concern.



Rob
 
D

David Portas

Robert said:
By definition, joins are expensive to perform, regardless of product.

Not true at all. Again, you are confusing logical and physical levels.
In Oracle or SQL Server if I run a query that joins tables A and B but
there also exists a materialized view that joins tables A and B then
the query will operate on the data structure that models the view. In
SQL Server's case that structure is a clustered index - exactly the
same clustered index as if I had created a "denormalized" table
containing the same data from A and B. Materialization not
denormalization is the feature at work in that case. By definition it
is the physical model that determines performance, NOT logical features
like joins and normalization.

To take another example among many: given certain database constraints
some semijoin operations may become entirely redundant and can be
optimized out at runtime. Therefore they cannot possibly be expensive.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
D

David Portas

Robert said:
One example would be Domain/Key Normal Form (formerly sometimes referred to
as Sixth Normal Form) vs. (the current) Sixth Normal Form. For more info
(albeit, not all that much), see the following wikipedia article:
http://en.wikipedia.org/wiki/Normal_forms

You are quite right and I had forgotten that. You said "perfectly
normalized" and I suppose I just assumed BCNF or 5NF. Thanks.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
 
C

Craig Alexander Morrison

« Do you ever feel the need to denormalise a database for the sake of the
user interface? »

The Users interface is with an application and in turn the application
interfaces with the database.

Even the applications may not see the underlying base tables.
Yeah, I do when there is a speed problem. A database is not some god
living in its own universe but a piece of software to be used by people
not only for them to enter data at one end but also for them to read data
at the other end.

Whilst denormalisation is generally wrong, some people think they are
denormalising when what they are doing is adjusting an over normalised
design, by that I mean the design has been more influenced by global
concerns than the actual problem domain being modelled.

However denormalisation can often result in fast access to garbage, without
taking a whole series of precautions to make up for the failings in the
database design.

As to Nulls, I much prefer the elimination of -most- Nulls, it is fairly
straightforward with careful design

--
Slainte

Craig Alexander Morrison
Crawbridge Data (Scotland) Limited

"Sylvain Lafontaine" <sylvain aei ca (fill the blanks, no spam please)>
wrote in message news:[email protected]...
 
R

Robert Morley

I've found materialized views to be nothing but a pain the ass in SQL...but
that, of course, is another issue not related to the issue at hand.

You're correct, though, that a materialized view would in fact perform as
you say. But when you get right down to it, that's simply denormalization
done by the RDBMS rather than the DBA...sometimes the DBA needs (or just
plain likes) to have more control over the process. :) Then, of course,
there are the times when you want a certain degree of control over the
process (i.e., multiple tables published at different times saved as
backups, or a table that's guaranteed to only be updated at midnight every
day, etc.). Many of these things COULD be done as materialized views,
granted, but I think it would be more effort than it's worth in many cases,
or simply undesirable by the implementing DBA for a host of other reasons.

As for semijoins that get optimized out of existence, I wasn't thinking of
that, but of course you're right there as well. But then you need to take
into account the (admittedly comparatively minimal) time taken by the
optimization process itself. If you have no joins, that portion of the
optimization process is skipped, and thus there's absolutely no time taken
(well, at least for that portion of the process...I seem to remember that
SQL Server goes through huge optimization processes for any query that's
longer than a zero-length string [and maybe even then! <grin>]).

As always, I think it really comes down to what the DBA most wants to do
given the situation at-hand.



Rob
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top