We have been running different structWSF instances for about two months now. Each instance is hosting different dataset(s) that are queried for different purposes. I think that it worth taking some time starting to analyze the querying stats of two of these instances of the early Alpha version of structWSF.
The goal is to create some kind of checkpoints that we will be able to use in the future to check how the system improved or deteriorated. It is also to check what kind of metrics we could derive from the current logging system, and to check if we could find any bottle neck or issues with any of the endpoints.
The data used to analyze the instance A span from the 2009-06-08 at 7:16:38 to the 2009-08-18 at 12:28:37.
The data used to analyze the instance B span from the 2009-05-20 at 1:46:31to the 2009-08-18 at 12:40:28.
structWSF Instance A
The instance A only has 1 dataset with about 1000 instance records in it. As we can notice bellow, the average time of a query to that instance for all web service endpoints is about 210 milliseconds.
Number of queries |
Average time for each query in seconds |
27956 | 0.218252857656909 |
The table bellow give us the total number of queries sent to each web service endpoint with an average time for each web service.
Web Service | Number of queries | Average time for each query in seconds | ||
dataset_create | 265 | 0.126993534699919 | ||
converter/tsv | 48 | 0.128808428843714 | ||
dataset_update | 17 | 0.140141641392576 | ||
dataset_read | 11780 | 0.144073766884864 | ||
auth_registrar_access | 883 | 0.145781793788779 | ||
converter/bibtex | 49 | 0.149710825511323 | ||
auth_lister | 1970 | 0.159979685066925 | ||
search | 1397 | 0.180938945980523 | ||
browse | 8949 | 0.199636802392004 | ||
crud_read | 638 | 0.241032384406063 | ||
dataset_delete | 263 | 0.420157149717388 | ||
crud_delete | 3 | 0.637878338496 | ||
converter/irv | 792 | 0.661979901670313 | ||
sparql | 715 | 1.123084135322358 | ||
crud_create | 187 | 1.486844727060763 | ||
This table gives the number of queries for each returned HTTP response status code by the endpoint. This kind of metrics is useful to debug potential issues
Web Service | Number of queries | HTTP Response Status | ||
auth_lister | 1968 | 200 | ||
auth_lister | 2 | 400 | ||
auth_registrar_access | 883 | 200 | ||
browse | 8949 | 200 | ||
converter/bibtex | 45 | 200 | ||
converter/bibtex | 2 | 400 | ||
converter/bibtex | 2 | 406 | ||
converter/irv | 740 | 200 | ||
converter/irv | 51 | 400 | ||
converter/irv | 1 | 406 | ||
converter/tsv | 43 | 200 | ||
converter/tsv | 2 | 400 | ||
converter/tsv | 3 | 406 | ||
crud_create | 66 | 200 | ||
crud_create | 116 | 400 | ||
crud_create | 5 | 500 | ||
crud_delete | 3 | 200 | ||
crud_read | 480 | 200 | ||
crud_read | 158 | 400 | ||
dataset_create | 265 | 200 | ||
dataset_delete | 261 | 200 | ||
dataset_delete | 2 | 500 | ||
dataset_read | 11767 | 200 | ||
dataset_read | 9 | 400 | ||
dataset_read | 4 | 500 | ||
dataset_update | 17 | 200 | ||
search | 1393 | 200 | ||
search | 4 | 400 | ||
sparql | 693 | 200 | ||
sparql | 19 | 400 | ||
sparql | 3 | 406 | ||
structWSF Instance B
The instance B has 25 datasets with about 2 312 000 instance records in it. As we can notice bellow, the average time of a query to that instance for all web service endpoints is about 550 milliseconds.
Why the average query time per query double with the size of that instance? It is what we will check.
Number of queries | Average time for each query in seconds | |||
37575 | 0.556303637714566 | |||
The table bellow give us the total number of queries sent to each web service endpoint with an average time for each web service. What we can notice is that the time it takes to create, delete and update records in the database management systems is related to the size of the dataset. So, what happened and is there anything we can do?
Most of the queries used for this analysis come from queries sent to structWSF v.1.0a1 and v1.0a2. However, something that has a major impact on these results changed in v1.0a3 that has been released last week. The big problem with these numbers is Solr’s commit time. In version v1.0a1 and v1.0a2, a Solr commit was issued each time something was updated in the index. Commit could take up to minutes sometimes with the size of its index. Since v1.0a3, we give that choice to the system administrator: he can issue commit each time something change in the index, or setup Solr’s AutoCommit setting properly. That means that we increased the performance of these CUD endpoints by about 95%.
For the SPARQL endpoint, the reason is that it is mostly exclusively used to export data from a structWSF instance. This means that big dump of RDF triples are incurred for each query, which justify the average time per query of 2.1 seconds.
Web Service | Number of queries | Average time for each query in seconds | ||
dataset_create | 173 | 0.09835156953404 | ||
auth_registrar_access | 1135 | 0.114255581658327 | ||
dataset_update | 121 | 0.119028852005636 | ||
dataset_read | 12683 | 0.159165935205064 | ||
crud_read | 8546 | 0.23457546435556 | ||
converter/bibtex | 109 | 0.405608450600873 | ||
auth_lister | 2315 | 0.471687612780759 | ||
search | 2313 | 0.533951056245796 | ||
browse | 9103 | 0.758227908033767 | ||
converter/tsv | 8 | 0.863690733909698 | ||
sparql | 650 | 2.115058046487879 | ||
converter/irv | 166 | 2.681712512510398 | ||
crud_update | 13 | 4.649851157114154 | ||
crud_create | 75 | 11.306954870223277 | ||
dataset_delete | 140 | 27.511527856750207 | ||
crud_delete | 25 | 34.33350466727492 | ||
This table gives the number of queries for each returned HTTP response status code by the endpoint.
Web Service | Number of queries | HTTP Response Status | ||
auth_lister | 2275 | 200 | ||
auth_lister | 11 | 400 | ||
auth_lister | 2 | 406 | ||
auth_lister | 27 | 500 | ||
auth_registrar_access | 1110 | 200 | ||
auth_registrar_access | 25 | 400 | ||
browse | 9084 | 200 | ||
browse | 18 | 400 | ||
browse | 1 | 406 | ||
converter/bibtex | 108 | 200 | ||
converter/bibtex | 1 | 400 | ||
converter/irv | 154 | 200 | ||
converter/irv | 12 | 400 | ||
converter/tsv | 8 | 200 | ||
crud_create | 41 | 200 | ||
crud_create | 33 | 400 | ||
crud_create | 1 | 500 | ||
crud_delete | 24 | 200 | ||
crud_delete | 1 | 400 | ||
crud_read | 8268 | 200 | ||
crud_read | 273 | 400 | ||
crud_read | 5 | 406 | ||
crud_update | 4 | 200 | ||
crud_update | 9 | 400 | ||
dataset_create | 171 | 200 | ||
dataset_create | 2 | 400 | ||
dataset_delete | 79 | 200 | ||
dataset_delete | 61 | 500 | ||
dataset_read | 12647 | 200 | ||
dataset_read | 11 | 400 | ||
dataset_read | 25 | 500 | ||
dataset_update | 113 | 200 | ||
dataset_update | 8 | 500 | ||
search | 2286 | 200 | ||
search | 24 | 400 | ||
search | 3 | 406 | ||
sparql | 618 | 200 | ||
sparql | 22 | 400 | ||
sparql | 6 | 406 | ||
sparql | 4 | 500 | ||
Generating the Stats
Here is the list of SQL query used to create these stat tables. You can run them locally on your structWSF instance to generate the same kind of statistics.
Timespan of the queries
select min(request_datetime) as startdate, max(request_datetime) as enddate from SD.WSF.ws_queries_log;
Get the average number of milliseconds per query sent to the syste
select count(request_processing_time) as nb_queries, avg(request_processing_time) as average_query_time from SD.WSF.ws_queries_log order by ID desc;
Get the average query time for each web service of a structWSF instance.
select requested_web_service, count(request_processing_time) as nb_queries, avg(request_processing_time) as average_query_time from SD.WSF.ws_queries_log GROUP BY requested_web_service ORDER BY average_query_time ASC;
Status messages counts per web service endpoint
select requested_web_service, count(request_http_response_status) as nb_queries, request_http_response_status from SD.WSF.ws_queries_log GROUP BY requested_web_service, request_http_response_status ORDER BY requested_web_service, request_http_response_status;