Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Proofread and validate errors

This document describes the results of performance tests for WebSpellChecker Web API. The performance and load were tested on the following setup:

  • 100 users accessing the server simultaneously;
  • Cache enabled for spell checking purposes; 
  • Certain hardware and software used (EC2 m5.large instance with 2 CPU and 8 GB RAM); 
  • Number of words to be checked (1K words or 6K characters);
  • Number of spelling and grammar problems in the text (15 grammar problems | 35 misspellings), around 5% of errors;

...

  • The type of

...

  • language used for

...

Testing goal

Our main goal was to observe the response time of text processing and CPU utilization on the server when 100 users send simultaneous requests on various languages to the server with WebSpellChecker v5.6.2. 

Environment and testing tool

...

We have run our tests continuously for each of the languages in the default language group (17 languages). The cache setting was enabled for all sets of tests. The following combinations of text to be checked are used for each language and for each user tier:

  • 1K words (6K characters) with 50 grammar problems, 200 misspellingswith 35 misspellings only;
  • 1K words with 15 grammar problems only;
  • 1K words with 35 misspellings onlywith 15 grammar problems and 35 misspellings.

The measured metrics were the response time and CPU utilization.

Observations

Our observations are presented in the charts below.

Response time and CPU utilization (only 35 misspellings)

The chart below represents response time results aggregated by language when there are only 35 misspellings in text of 1K word size.

Image Added


The chart below represents CPU utilization results aggregated by language and user tiers when there are only 35 misspellings in text of 1K word size.


Image Added


Response time and CPU utilization (only 15 grammar problems)

The chart below represents response time results aggregated by language when there are only 15 grammar problems in text of 1K word size.

Image Added


The chart below represents CPU utilization results aggregated by language when there are only 15 grammar problems  in text of 1K word size.

Image Added

Response time and CPU utilization (35 misspellings and 15 grammar problems)

The chart below represents response time results aggregated by language when there are 35 misspellings and 15 grammar problems in text of 1K word size.

Image Added

The chart below represents CPU utilization results aggregated by language when there are 35 misspellings and 15 grammar problems in text of 1K word size.

Image Added

Summary response time and CPU utilization

The chart below represents response time results aggregated by language for 100 simultaneous users in case of three scenarious: 35 mispellings only, 15 grammar problems only, 35 misspellings and 15 grammar problems.

Image Added

Findings and recommendations

Here are the outcomes and aftermath as well as our advice on hardware and software requirements and notes on performance issues which users may encounter:

Note

During testing we used API requests which contain 1K words per request where 3.5% of words are spelling errors and 1.5% are for grammar errors. On average a person writing on the second language makes around 4%-5% of errors in total. The mechanism of requests distribution and size of each requests for UI-based product are being optimized greatly to ensure smooth experience for users and decrease the load on the servers.

  • The response time and CPU untilization metrics depend a lot on the number of words in the dictionary for spelling check and number of rules for grammar check. The more words and rules are for the language, the higher response time and CPU. For example, Brazillian Portuguese contains over 10 million words in the dictionary, thus, it takes longer to process a request in comparison with other languages. The only exception here is the Ukrainian language, it uses a different spelling check engine which doesn't support multi-threading. As a result it has the worst results.
  • The spelling check function in more light weight in comparison with the grammar checking. Having grammar checking enabled more than doubles the response time and increases the CPU utilization.
  • Under given conditions (size of a text and percentage of errors), when the number of users increases to up 100, one m5 instance entails 99-100% CPU load and a significant increase of response time. Our recommendation for the case when more users are added and CPU load constantly reaches 100% on the machine:
    • upgrade instance type and add more CPUs to it;
    • add one more machine to distribute the traffic (requests) between two or more machines, for example, using load balance and auto-scalling.

If you have any questions or comments regarding the outlined results, please fill free to contact us.