@JPB,
Thank you so much for your reply.
I think some more explanation might help clarify the issue. Here is the number of articles in each corpus:
Corpus X = 10 articles
Corpus Y = 1800 articles
Corpus Z = 1400 articles
TOTAL = 3210 articles
If we pool the correct/incorrect answers across all corpora (instead of looking at the accuracy in each corpus separately), I think we would get a more accurate analysis. For example, "Algorithm A" has 2998 correct answers out of 3210 articles, which gives 93% accuracy. But if we simply average the per-corpus accuracies without weighting by the number of articles in each corpus, we get (80% + 93% + 94%)/3 = 89% accuracy. (That is the effect of the sample sizes that you mentioned.)
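To make the difference between the two calculations concrete, here is a small Python sketch. The per-corpus correct counts (8, 1674, 1316) are an assumption: they are back-computed from the rounded accuracies of 80%, 93% and 94%, although they do add up to the 2998 correct answers for "Algorithm A". The variable names are only for illustration.

```python
# Number of articles in each corpus, as listed above.
articles = {"X": 10, "Y": 1800, "Z": 1400}

# ASSUMPTION: correct answers per corpus for "Algorithm A", back-computed
# from the rounded accuracies (80%, 93%, 94%); they sum to 2998.
correct = {"X": 8, "Y": 1674, "Z": 1316}

# Accuracy within each corpus.
per_corpus = {name: correct[name] / articles[name] for name in articles}

# Unweighted mean: every corpus counts equally, so the 10 articles of
# Corpus X weigh as much as the 1800 articles of Corpus Y.
unweighted_mean = sum(per_corpus.values()) / len(per_corpus)

# Pooled accuracy: every article counts equally.
pooled = sum(correct.values()) / sum(articles.values())

print(f"Unweighted mean of accuracies: {unweighted_mean:.0%}")
print(f"Pooled accuracy:               {pooled:.0%}")
```

The pooled figure is the same as a mean of the per-corpus accuracies weighted by corpus size, which is why the small Corpus X barely affects it.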
I should point out that the main goal is to find the best algorithm. So, we can treat all corpora as a single corpus of 3210 articles.
I am willing to try a chi-square test, but I really do not know what the expected value for each algorithm should be.
If you need more explanation, please let me know.
Thanks a lot.