How to RANDOMLY split a whole dataset into two sub-dataset?

Z

zencaroline

Hi,

At your possible convenience, might anyone please kindly answer my
question?

How to RANDOMLY split a whole dataset (n=2000) into two sub
dataset (n=1000, n=1000) in SPSS or Excel?

Thank you very much.

Please take care

Caroline
 
D

DaveL

Transform --> Compute.

Create a new target variable (say "uni") then under numeric expression
type "univariate(1)". Press OK
This will have created a variable called "uni" that will be as long as
your data (n=2000) randomly assigning each one a value between 0 and
1. As it is a uniform distribution there should be n=1000 with a "uni"
score of <=0.5 and n=1000 for a "uni" score of > 1000.


If anyone has a better way i too would be interested

Cheers

DaveL
 
C

Chad

The routine I use is as follows:

Function Sample(IP_Data, Nsamp, nobs, Replace As Boolean, ObsNum As Boolean)
Dim i As Double, j As Double, k As Double, n As Double, skip As Boolean
Dim p As Double, Out(), Cnt As Double, pick As Double, Samp As Double

'******************************************************************
'** Function returns a simple random sample from an input array.**
'** Arguments: **
'** IP_data = input data array **
'** Nsamp = number of samples **
'** nobs = number of observations in each sample **
'** Replace = if TRUE, sample with replacement. If FALSE **
'** sample is returned without replacement. **
'** ObsNum = if TRUE, the number of the observation is **
'** included in the first column **
'******************************************************************

With Application
n = UBound(IP_Data, 1)
p = UBound(IP_Data, 2)
Samp = 0

If Replace Then
'Sample with replacement
If ObsNum Then
'Include observation number column
ReDim Out(1 To nobs * Nsamp, 1 To p + 1)
For i = 1 To nobs
For j = 1 To Nsamp
Cnt = Cnt + 1
pick = .Ceiling(Rnd(1234) * n, 1)
Out(Cnt, 1) = pick
For k = 1 To p
Out(Cnt, k + 1) = IP_Data(pick, k)
Next k
Next j
Next i
Else
'No observation number column
ReDim Out(1 To nobs * Nsamp, 1 To p)
For i = 1 To nobs
For j = 1 To Nsamp
Cnt = Cnt + 1
pick = .Ceiling(Rnd(1234) * n, 1)
For k = 1 To p
Out(Cnt, k) = IP_Data(pick, k)
Next k
Next j
Next i
End If
Else
'Sample without replacement
If ObsNum Then
'Include observation number column
ReDim Out(1 To nobs * Nsamp, 1 To p + 1)
For j = 1 To Nsamp
For i = 1 To nobs
Cnt = Cnt + 1
skip = False
pick = .Ceiling(Rnd(1234) * n, 1)
'Check for obs in sample
If Cnt > 1 Then
For k = 1 To Cnt
If pick = Out(k + Samp, 1) Then
'Found repeat
skip = True
Cnt = Cnt - 1
i = i - 1
Exit For
End If
Next k
End If
If Not skip Then
Out(Cnt + Samp, 1) = pick
For k = 1 To p
Out(Cnt + Samp, k + 1) = IP_Data(pick, k)
Next k
End If
Next i
Samp = Samp + nobs
Cnt = 0
Next j
Else
'No observation number column
ReDim Out(1 To nobs * Nsamp, 1 To p)
For j = 1 To Nsamp
For i = 1 To nobs
Cnt = Cnt + 1
skip = False
pick = .Ceiling(Rnd(1234) * n, 1)
'Check for obs in sample
If Cnt > 1 Then
For k = 1 To Cnt
If pick = Out(k + Samp, 1) Then
'Found repeat
skip = True
Cnt = Cnt - 1
i = i - 1
Exit For
End If
Next k
End If
If Not skip Then
For k = 1 To p
Out(Cnt + Samp, k) = IP_Data(pick, k)
Next k
End If
Next i
Samp = Samp + nobs
Cnt = 0
Next j
End If
End If

End With

Sample = Out
End Function
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top