Who is the real author of The Life of Lazarillo de Tormes?

The authorship of The Life of Lazarillo de Tormes and of his Fortunes and Adversities (usually referred to as the Lazarillo de Tormes or, as we do from here on, simply the Lazarillo) is a topic that has interested researchers ever since the story was first published. The earliest preserved editions were printed in 1554 in Burgos (Spain), Alcalá de Henares (Spain), Medina del Campo (Spain), and Antwerp (Belgium), although at least two earlier editions, yet to be found, may be needed to complete the phylogenetic tree (figure 1 shows a possible stemma). After a short period of popularity, in 1559 it was added to the Index of forbidden books compiled by the Inquisitor General Fernando de Valdés, and was therefore banned from public circulation because of its sharp anticlerical criticism. The text's religious aspects have been particularly influential in scholars' attempts to create an accurate profile of the anonymous writer.

The list of possible authors has grown over the years, along with the painstaking effort of many researchers who devoted their time, intelligence, and expertise (sometimes their entire careers) to this text. They have been guided by a noble, scientific goal: to put an end to the enigma and unveil the true identity of the author of the Lazarillo. These 400 years of attributions have left us a vast, nearly intractable bibliography that must be reviewed and studied before one can hope to contribute to the state of the art.

By analyzing the candidate authors' works from a data-driven perspective and applying machine learning techniques for style and text fingerprinting, we shed light on the authorship of the Lazarillo. As in a state-of-the-art survey, we discuss the methods used and how they perform in our specific case.
Profile-based plots for RAR and PPM compression formats. Heatmap and dendrogram for the profile-based approach using our implementation of NCD combined with the (A) RAR and (B) PPM compression formats.

According to the methodology, the most likely author seems to be Juan Arce de Otálora, closely followed by Alfonso de Valdés. However, the method indicates that no certain attribution can be made with the given corpus.
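
For readers unfamiliar with NCD (normalized compression distance), the idea behind the profile-based plots can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the article: the standard-library compressors (bz2, zlib, lzma) stand in for RAR and PPM, and the names `ncd` and `rank_candidates`, as well as the placeholder texts, are hypothetical. Each author's "profile" is assumed to be the concatenation of that author's known works.

```python
import bz2
import zlib
from typing import Callable, Dict, List, Tuple

Compressor = Callable[[bytes], bytes]


def ncd(x: str, y: str, compress: Compressor = bz2.compress) -> float:
    """Normalized compression distance between two texts.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size in bytes. Lower values mean
    the two texts share more compressible structure.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx = len(compress(bx))
    cy = len(compress(by))
    cxy = len(compress(bx + by))
    return (cxy - min(cx, cy)) / max(cx, cy)


def rank_candidates(
    disputed: str,
    profiles: Dict[str, str],
    compress: Compressor = bz2.compress,
) -> List[Tuple[str, float]]:
    """Sort candidate authors by NCD to the disputed text, closest first."""
    scores = [(name, ncd(disputed, text, compress)) for name, text in profiles.items()]
    return sorted(scores, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Placeholder strings only; a real experiment would load full texts.
    disputed_text = "placeholder for the disputed text " * 20
    author_profiles = {
        "Candidate A": "placeholder for the works of candidate A " * 20,
        "Candidate B": "a different placeholder for candidate B " * 20,
    }
    for name, score in rank_candidates(disputed_text, author_profiles):
        print(f"{name}: {score:.3f}")
```

The choice of compressor matters in practice: zlib, for example, only looks back over a 32 KB window, so it captures little shared structure between long texts, which is one reason to compare several compressors as the figure does with RAR and PPM.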

Unmasking Lazarillo against each of 6 authors (n=250, k=3). The curve below all the authors is that of Juan Arce de Otálora, the most likely author, followed by that of Alfonso de Valdés.

Find this article published in LEMIR.