e-News datasets

This e-News dataset was collected from four newspapers: The New Zealand Herald (www.nzherald.co.nz), The Australian (www.theaustralian.com.au), and, from the UK, The Independent (www.independent.co.uk) and The Times (www.timesonline.co.uk). It covers five topics: business, education, entertainment, sport and travel. Each document was labelled manually by skimming the text and deciding its category. In the provided data files, each document is formatted as a single line of plain text whose first character is the class label (for training data); punctuation and stop words have been removed in advance. The dataset is summarised below:

Topic                 Business  Education  Entertainment  Sport  Travel
No. of documents           227         93             99    118     131
Avg. no. of words          105         90             92     99     103
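
The line format described above (first character is the class label, the rest is the document text) can be read with a few lines of Python. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, words are assumed to be separated by whitespace, and the mapping from label characters to topics is not given here, so labels are kept as raw characters.

```python
from collections import Counter


def parse_line(line):
    """Split one data line into (label, tokens).

    Assumes the first character is the class label and the remaining
    text is a whitespace-separated list of words.
    """
    line = line.rstrip("\n")
    if not line:
        raise ValueError("empty line")
    label = line[0]
    tokens = line[1:].split()
    return label, tokens


def load_documents(lines):
    """Parse every non-blank line of an iterable of lines."""
    return [parse_line(line) for line in lines if line.strip()]


def summarise(docs):
    """Return {label: (number of documents, average words per document)}."""
    counts = Counter()
    word_totals = Counter()
    for label, tokens in docs:
        counts[label] += 1
        word_totals[label] += len(tokens)
    return {label: (counts[label], word_totals[label] / counts[label])
            for label in counts}
```

Passing an open training file to load_documents and the result to summarise should give per-class document counts and average lengths that can be compared against the table above. Because unlabelled files have no label character, this sketch applies to the training data only.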