S
smurray444
Hi there - thanks for looking,
I have a large dataset with various columns of data for global climate
(meteorological) stations, however the dataset is not complete - it has
gaps for some years where data has not been collected/the equipment
failed.
I am hoping to be able to interpolate the data in order to fill these
gaps... via a simple linear method. For example, if there are three
cells as shown below, then the gap in cell B can be filled by taking
the average of cells A and C to equal 50 etc etc.
A = 40
B = [GAP] (=50 when interpolated/averaged)
C = 60
There are also incidents where there are a series of gaps in
succession, eg. as follows:
A = 10
B = GAP
C = GAP
D = GAP
E = 50
I understand that it might be difficult to fill the gaps whereby the
values increment in turn (ie. = 20, 30, 40 respectively in the example
above) - but would it be possible to design an automated solution
whereby the missing data is filled by doing, say, (50+10)/2 = 30... and
'30' is used to fill the gaps from B-D (there are not necessary 3 blanks
in each case, this ranges considerably).
The final piece of the jigsaw is in averaging the values correctly when
the climate station changes and there are gaps for both stations in
succession, as demonstrated in the simplified example below:
Station 1: A = 10
Station 1: B = 15
Station 1: C = 20
Station 1: D = GAP (=25)
Station 1: E = 30
Station 1: F = GAP (=35)
Station 1: G = GAP (=40)
Station 2: A = GAP (=4)
Station 2: B = GAP (=5)
Station 2: C = 6
Station 2: D = 7
Station 2: E = GAP (=8)
Station 2: F = 9
Station 2: G = 10
In the case of Station 1: F-G and Station 2: A-B, an average is not
required of the values above and below the gap present (ie. not taking
the average of 30 and 6 and filling each of the gaps with 13), but is
required only for the data in that particular station. So would there
be a way of informing the to be wary of such changes in station
reference? (Station reference number is in column A in my dataset)
Is there a way of designing (a macro in Visual Basic?) to automate
these procedures in Excel 2002? - as doing it manually would be
unfeasible due to the nature of the dataset size. Incidently, data
which requires interpolation is in columns F to M (inclusive), and is
represented as a blank cell. Interpolation is required vertically down
the columns, and not between rows.
Thanks very much for looking at this - I would be very grateful if
someone manages to manufacture a solution.
Thanks once more,
Steve Murray
I have a large dataset with various columns of data for global climate
(meteorological) stations, however the dataset is not complete - it has
gaps for some years where data has not been collected/the equipment
failed.
I am hoping to be able to interpolate the data in order to fill these
gaps... via a simple linear method. For example, if there are three
cells as shown below, then the gap in cell B can be filled by taking
the average of cells A and C to equal 50 etc etc.
A = 40
B = [GAP] (=50 when interpolated/averaged)
C = 60
There are also incidents where there are a series of gaps in
succession, eg. as follows:
A = 10
B = GAP
C = GAP
D = GAP
E = 50
I understand that it might be difficult to fill the gaps whereby the
values increment in turn (ie. = 20, 30, 40 respectively in the example
above) - but would it be possible to design an automated solution
whereby the missing data is filled by doing, say, (50+10)/2 = 30... and
'30' is used to fill the gaps from B-D (there are not necessary 3 blanks
in each case, this ranges considerably).
The final piece of the jigsaw is in averaging the values correctly when
the climate station changes and there are gaps for both stations in
succession, as demonstrated in the simplified example below:
Station 1: A = 10
Station 1: B = 15
Station 1: C = 20
Station 1: D = GAP (=25)
Station 1: E = 30
Station 1: F = GAP (=35)
Station 1: G = GAP (=40)
Station 2: A = GAP (=4)
Station 2: B = GAP (=5)
Station 2: C = 6
Station 2: D = 7
Station 2: E = GAP (=8)
Station 2: F = 9
Station 2: G = 10
In the case of Station 1: F-G and Station 2: A-B, an average is not
required of the values above and below the gap present (ie. not taking
the average of 30 and 6 and filling each of the gaps with 13), but is
required only for the data in that particular station. So would there
be a way of informing the to be wary of such changes in station
reference? (Station reference number is in column A in my dataset)
Is there a way of designing (a macro in Visual Basic?) to automate
these procedures in Excel 2002? - as doing it manually would be
unfeasible due to the nature of the dataset size. Incidently, data
which requires interpolation is in columns F to M (inclusive), and is
represented as a blank cell. Interpolation is required vertically down
the columns, and not between rows.
Thanks very much for looking at this - I would be very grateful if
someone manages to manufacture a solution.
Thanks once more,
Steve Murray