I
intuitionist
I've built an Excel RTD server for data comming over the network but I
am having performance problems. I have reason to believe that the main
bottleneck is the way in which I handle the data in the RTD server DLL,
and have some questions in this regard.
Excel's RTD API is very simple. MSDN has some very impressive
performance figures for the RTD function, but all of the data
management is left to the programmer. In particular, we observe that
the RTD function in Excel passes topic strings (such as "IBM", "BID")
to the RTD server. But the RTD server is required to return pairs of
integer topic IDs and values as an array. There is a many-to-one
mapping between the integer topic IDs and the topic strings but no
facilities are provided for efficiently handling this mapping. Only
Excel and the RTD server are aware of the topic IDs; the external data
source is only aware of the topic strings. Data arrives at the RTD
server one topic at a time. This data must somehow be stored in memory
so that the new data is ready to be returned when Excel pulls the data
by calling RefreshData(). Only new data needs to be returned by
RefreshData(), so ideally the RTD server should also keep track of what
data data needs to be pushed to Excel and what data has already been
pushed. So the questions are:
1. As updates arrive at the RTD server, one topic at a time, a
one-to-many lookup is needed in order find out which topic IDs
correspond to the topic string. How should this lookup be done, keeping
real-time performance requirements in mind? In particular, what sort of
data structure / algorithim would be most efficient?
2. When Excel pulls the data, an array of topic IDs and new return
values has to be created. How do we keep track of what data needs to be
in this update array, and what data can be excluded because it has
already been sent? How should the array be constructed?
An MSDN article gives an example of an RTD server
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexcl2k2/html/odc_xlrtdvbnet.asp)
The article has a subsection titled "Keeping Track of RTD Data" but
unfortunately the example revolves around generating random data. As
such, the author never has to worry about the mapping between topic
strings and topic IDs. No look-ups betweens topic strings and IDs is
ever done. The example also stores the data as a Collection, instead of
an array - this using a Collection is really scalable? Also, the
example returns the whole collection each time RefreshData() is called
even though in reality is unlikely that every single cells needs to
updated each time Excel calls RefreshData(); wouldn't this rapidly
become a performance problem as we try to scale up the system?
Any suggestions are appreciated. Thanks.
am having performance problems. I have reason to believe that the main
bottleneck is the way in which I handle the data in the RTD server DLL,
and have some questions in this regard.
Excel's RTD API is very simple. MSDN has some very impressive
performance figures for the RTD function, but all of the data
management is left to the programmer. In particular, we observe that
the RTD function in Excel passes topic strings (such as "IBM", "BID")
to the RTD server. But the RTD server is required to return pairs of
integer topic IDs and values as an array. There is a many-to-one
mapping between the integer topic IDs and the topic strings but no
facilities are provided for efficiently handling this mapping. Only
Excel and the RTD server are aware of the topic IDs; the external data
source is only aware of the topic strings. Data arrives at the RTD
server one topic at a time. This data must somehow be stored in memory
so that the new data is ready to be returned when Excel pulls the data
by calling RefreshData(). Only new data needs to be returned by
RefreshData(), so ideally the RTD server should also keep track of what
data data needs to be pushed to Excel and what data has already been
pushed. So the questions are:
1. As updates arrive at the RTD server, one topic at a time, a
one-to-many lookup is needed in order find out which topic IDs
correspond to the topic string. How should this lookup be done, keeping
real-time performance requirements in mind? In particular, what sort of
data structure / algorithim would be most efficient?
2. When Excel pulls the data, an array of topic IDs and new return
values has to be created. How do we keep track of what data needs to be
in this update array, and what data can be excluded because it has
already been sent? How should the array be constructed?
An MSDN article gives an example of an RTD server
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexcl2k2/html/odc_xlrtdvbnet.asp)
The article has a subsection titled "Keeping Track of RTD Data" but
unfortunately the example revolves around generating random data. As
such, the author never has to worry about the mapping between topic
strings and topic IDs. No look-ups betweens topic strings and IDs is
ever done. The example also stores the data as a Collection, instead of
an array - this using a Collection is really scalable? Also, the
example returns the whole collection each time RefreshData() is called
even though in reality is unlikely that every single cells needs to
updated each time Excel calls RefreshData(); wouldn't this rapidly
become a performance problem as we try to scale up the system?
Any suggestions are appreciated. Thanks.