The following links are only the tip of the iceberg when it comes to corpora (text collections) suitable for text analysis. The corpora listed below, however, were compiled by members of the CSG and checked for compatibility with well-known stylometric software.
- A Small Collection of British Fiction
- 100 Polish Novels
- 100 English Novels
- 68 German Novels
- 100 Russian Novels
- Latin New Testament
- Roman de la Rose, to play with the Rolling Classify method
Documentation of the package ‘stylo’
- for (real) beginners: a crash introduction in the form of a slideshow
- for (sort of) beginners: a concise HOWTO
- for advanced users: a paper in R Journal
- full documentation at CRAN
Blog posts on non-obvious functions of the package ‘stylo’
- Authorship verification with the package ‘stylo’
- Cross-validation using the function
- Custom distance measures
- Testing rolling stylometry
A list of relevant publications by CSG members can be found on this website, on the subpage ‘publications’. For a comprehensive overview, however, the Stylometry Bibliography curated by Christof Schöch is definitely the place to consult before starting any experiment in text analysis.
Learn with us
Members of the group regularly conduct invited workshops around the world, including yearly courses at the Digital Humanities Summer Institute (DHSI) in Victoria BC and the European Summer University in Digital Humanities (ESUDH) in Leipzig. Updates on upcoming events will be added below.
2019 major workshops
28 Feb - 1 Mar: Style / Content – Literary Modeling in Stockholm, Sweden. Taught by Jan Rybicki.
3-5 May: Stylometry at DHI Beirut in Beirut, Lebanon. Taught by Jan Rybicki.
10-14 June: Digital Humanities Summer Institute in Victoria BC, Canada. Taught by Maciej Eder and Joanna Byszuk.
23 Jul - 2 Aug: The European Summer University in Digital Humanities in Leipzig, Germany. Taught by Maciej Eder and Jeremi Ochab.