Studies in Educational Evaluation, 1997, 23, 31-48.
(Based on earlier work presented at, among others, the EARLI congress 1995 in Nijmegen) (intended to be the first part of my dissertation [in Dutch]: 1995, Leren waarderen, SVO project, with notes)

draft



Assessment in historical perspective

Ben Wilbrink

abstract

A historical perspective on assessment is presented, organized around seven themes, and located in the time window between the early middle ages and the end of the 19th century: ways of questioning, the graded school, medieval university examinations, the disputation, ranking and marking systems, examinations and personnel selection, and the possibly facilitating influence of Chinese mandarin examinations on the development of meritocratic assessment. It is concluded that the essential characteristics of current assessment practice were already established by the end of the 19th century, in contrast to the received view on ‘educational measurement’ being a direct descendant of the innovative work of Galton and Binet.




Assessment in historical perspective

Historical study has the power to illuminate current assessment practice, because that practice is for the greater part traditional practice, and it is a powerful instrument to stimulate reflection on assessment. However, a systematic historical treatment of the subject is not available. Some authors do come close, such as Smallwood (1935) in her study on the history of examinations in the U.S.A., Prahl (1974) in his dissertation on the history of examinations in Western Europe's universities, or Hanson (1993) who, from the perspective of the anthropologist, critically treats the place testing has in modern American society, tracing its roots to witch trials and medieval education. The history of assessment has to be assembled from information hidden in many different monographs, school histories, and studies on this or that aspect or period of assessment practice, as will be clear from the references used in this article. The purpose of this article is to facilitate reflection on critical aspects of current assessment practice by tracing their possible roots in history. The search may uncover some unsuspected facts, for example the existence of an earlier continental analogue of the Mathematical Tripos, the 18th century Cambridge competitive examination.

The sheer age of some assessment traditions shows them to be relatively immune to changes in cultural environment. Indeed, the university as an institution is one of the oldest of the western world, and university examinations are as old as the universities of Bologna and Paris. The concept of school organization in forms (the graded school), and so the concept of curriculum, is only two centuries younger. University examinations and the idea of the school form are obvious subjects for historical analysis. While examinations in the 13th century share characteristics with modern examinations, it is not to be assumed they had the same function and meaning to the actors involved as they have in the 20th. Assessment practice must be studied in its historical context, in order to understand how a particular practice was a solution to problems and tasks as perceived by historical actors.

The reverse case is just as interesting, the solutions of the past still being thought valid in current education even though the original problems have long since ceased to exist. It is quite conceivable that our ineradicable habit of ordering and ranking students is such a solution to a problem that no longer exists, or that it is no longer a legitimate solution to an original and still existing problem.

In this article assessment is to be understood in a generic sense, in contrast to the specific approach known as ‘educational measurement’ which shares its 19th century roots with those of psychological testing. The concept of assessment will be intentionally left undefined, leaving it to the historical analysis to give it shape and content. That it would be misleading to attempt to define assessment on the basis of contemporary practice is illustrated by the fact that in the medieval university the disputation was a prominent part of the examination, with no parallel or analogue in current examinations. We have to settle for ‘family resemblances’ to illuminate the concept of assessment in its historical and current manifestations.


What does it mean to know something?

Medieval education can be characterized as ‘teaching’ students to learn sacred and other texts by heart. To know something was to know it by heart (Riché, 1989, p. 218). In the early middle ages the texts to be learned were religious texts, and learning took place mostly in monasteries and convents. There was an urgent motivation to learn the Holy Scripture and other religious texts, because doing so made it more likely that one would be admitted to heaven after death. It was not only the scarcity of manuscripts that forced the monks to learn the texts by heart; because medieval manuscripts were difficult to read, one already had to know the text by heart in order to be able to read it (Bolgar, 1954, p. 111). Muslim manuscripts were ambiguous because they consisted only of consonants; therefore the Muslim student had to give proof by recitation that he ‘knew’ the text. Only then would his master authorize him to teach the text (Berkey, 1992, p. 29).

The medieval monk was confronted with a double task: learning Latin grammar in order to be able to learn Latin texts. Meditation, consisting of the recitation of religious texts, was an important activity for the monk. Holy texts, of course, were written in Latin, so one had to study Latin grammar in order to learn to understand and to speak Latin. The study of grammar consisted in the learning by heart of famous grammars dating from the Roman Empire, or simpler textbooks used for beginners. These grammars were written in the style of questions-and-answers, which was a familiar style in antiquity (see, in the Bible, the questioning of Adam and Eve by the Lord). Memory could use some support, so many manuscripts had illustrations that served as mnemonics. The ‘art of memory’ (Yates, 1966) was practised widely; the Jesuit Matteo Ricci even tried to convince the Chinese of its usefulness in preparation for their exams.

Under these circumstances assessment necessarily took the form of having students recite, answer the questions as posed in the grammar that was used, or question each other. The arts examinations at the medieval universities consisted mainly of very simple questions and answers (Lewry, 1982, p. 116). Questioning and answering was the dominant didactic form in teaching and learning. Knowing the right answers to questions about religious texts was extremely important. Out of this kind of questioning grew the catechism, and in its wake the catechetical method. These archetypes of assessment were still dominant in education as late as the 19th century (Foden, 1989, p. 12). Only in the second half of the 19th century did the American colleges replace the recitation method with lectures or ‘group discussions.’ The recitation method was a combination of learning and examining, but in the American colonies the examining part was in fact non-existent: “The colonial college student was essentially ungraded and unexamined. (...) public oral examinations were gestures in public relations and therefore not designed to show up student deficiencies” (Rudolph, 1977, p. 145). The first written examinations in Oxbridge in a sense followed the catechetical method, because no questions were put that allowed different interpretations: ‘the way to achieve more accurate and certain means of evaluating a student's work was to narrow the range of likely disagreement and carefully define the area of knowledge students were expected to know’ (Rothblatt, 1974, p. 292).

It is the experience of almost every living adult in developed countries that even today a substantial part of all questioning and assessment in education is recitation and giving the ‘right’ answers to known types of questions. Most standardized tests only count the proportion of ‘right’ answers. The difference between modern and medieval testing seems to be mainly that not the salvation of one's soul but that of one's career depends on producing the right answers. Irony apart, the question-and-answer paradigm deserves the critical scrutiny it is now getting from many different quarters.


Joan Cele: 14th century originator of Western style education


A walk with Sjoerd Karsten in Zwolle, city of Joan Cele


Important principles of curricular and school organization were developed by Joan Cele, rector of the Latin school of Zwolle, a Hanseatic town in the Low Countries, in the period ca. 1375 to 1415 (Frederiks, 1960; Codina Mir, 1968). Cele, a famous teacher, had to run a school with 800 to 1000 students in a town with only five thousand inhabitants. Many of these students came from Utrecht, Liège, Flanders, and the German countries. Cele solved the organizational problems posed by the sheer number of his students by imposing a new and strict division of the students into eight classes, and of the curriculum into eight different forms. Cele hired two Parisian masters in the arts to teach philosophy in the highest two classes. However, most of the students were in the lower classes, learning Latin and its grammar. Still being confronted with classes of up to a hundred students, Cele introduced a subdivision into groups of ten students called decuriae. Each group had a leader who was responsible for learning and discipline; leadership was changed every week. Twice a year Cele held examinations for promotion to a higher form. In the lower forms the exam consisted of a recitation to check on the achievement of the task posed in that form; in the higher forms Cele also looked for the student's insight (sententia) into the meaning and message of the Latin texts that were translated.

Cele's innovations were important because his students introduced the new didactic principles in schools all over Europe, among them the university of Paris. The Jesuits, whose Ratio Studiorum was inspired by this didactic modus parisiensis, definitely established this pedagogy in Europe's schools and universities (Codina Mir, 1968). Joan Cele single-handedly established the European model of the graded school, examinations for promotion, and ranking of students on the basis of merit. The historical importance of Cele's innovation was only recently revealed by the work of Post (1954), Frederiks, and Codina Mir. As late as 1960 Philippe Ariès could still present a meticulous study on the evolution of the graded school, being unaware of the source of the innovations lying in Zwolle in the 14th century.

The medieval class contained pupils of different ages, who might have been in the same form for quite different durations. Current school organization is certainly based on the ideas of Cele, but in the 18th and 19th centuries classes came to be constituted bureaucratically according to age and with a fixed duration of stay, grade retention coming to be a consequence of bureaucratic rules, whereas earlier students used to be promoted on the basis of their learning potential (Paulsen, 1921, p. 621; Ingenkamp, 1972, p. 24, 42). The educational philosophy legitimating this modern school organization, modelled on the standing armies of the newly formed states, was already formulated by Comenius in the 17th century (Ingenkamp, 1972, p. 16). This evolution in the principle of grouping pupils in classes on the basis of age instead of learning is extremely important, for now pupils came to be assessed in comparison to their peers in age, not in comparison to their peers in learning.


Examinations at the medieval university of Paris

The university of Paris in the middle ages was an organization of masters, in contrast to the university of Bologna that was an organization of its wealthy students. Most of the Parisian masters, however, were masters of arts, being at the same time students in one of the ‘superior’ faculties of law or theology. In the medieval university one first had to study grammar and philosophy in the four-year arts curriculum, only ‘masters of arts’ being admitted to the ‘superior’ faculties of medicine, law or theology. Nobody could be a student in Paris without having a master, so the first thing the newly arrived student had to do was to seek himself a good master (Thorndike, 1944, p. 30). The master was responsible for his students: he saw to it that they spent their time in study and not in idleness, he set daily exercises and heard their recitations. The master put students in competition with each other, giving explicit praise to the student with the best achievement of the day, and blaming the student who blundered worst by giving him the cap with earflaps or asinus. Assessment was part and parcel of the daily life of the medieval student. In the early German universities the ‘propaedeutic’ arts examination tested students on questions and answers that were extensively practised in the years before. The level of this arts examination was surpassed by that of many schools, such as the schools of Brethren of the Common Life (Schwinges, 1986, p. 336, 356). To help students in the preparation for their exams there were, already in the 13th century, ‘examination compendia’ available (Lewry, 1982), the same kind of book with questions and answers and model-poems from the civil service examinations that was a commercial success in China (Hu, 1984, p. 13).

A major responsibility of the master was to nominate his students for examinations, but only if he deemed them ready. Examinations were public and formal events; failing a candidate was extremely rare, and even then the reason would be the moral behaviour of the candidate (Schwinges, 1992, p. 235). The candidate was questioned on his knowledge of the prescribed books, he had to deliver a lecture on a text that was assigned to him only hours before, and he had to take part in a public disputation. The candidate had to give a proof of what the examination would qualify him to do: to lecture.

What really is new and characteristic for the university as a new institution, is the examination by a committee of masters acting on behalf of the representative of the Pope, the chancellor of Paris (see Weijers, 1995, on research on these examinations). “More particularly, [the universities] were the only institutions—and this was one of the great innovations of the medieval university system—to link teaching and examinations closely together” (Verger, 1992, p. 43). The successful candidate received the ‘licence to teach,’ a certificate that enabled the licentiate to teach anywhere in the Christian world and to attract students. Until the rise of the university, the authority to teach was self-declared or based on a written statement from one's own master, and the licence to teach was temporarily given by the local representative of the church. The genesis of the university examination coincides with the loss, about 1200, of absolute autonomy of the individual master. The individual master became dependent on his examining colleagues: only they could recommend his pupil for the ‘licence to teach.’ Another way to describe the introduction of the examination is to say that the chancellor of Paris lost his autonomy in the appointment of university teachers because now there had to be an examination of the candidate by a committee composed of masters of the university. Gradually there grew a distinction between the examination and the appointment as a master. Still later, examinations did qualify one for a certain profession, but did not give entry to that profession because an academic grade was only one of many qualifications, descent and wealth being the more important ones (Moraw, 1992). The new institution and its examinations for the first time in Western history defined what knowledge was, thereby also encouraging the new phenomenon of professionalization (Bullough, 1978).

The university examination was a new institution, having no model in the past, nor in any other country. Webber (1989, p. 36) suggested that the sudden appearance of examinations was influenced by contacts with the Chinese. There are two problems with this hypothesis. At the time, just before Genghis Khan established his realm, there were no direct contacts with the Chinese, and Marco Polo went on his journey to China when the examinations were already in place. More important is that the then existing Chinese examinations did not particularly resemble the new university examinations. Another possibility would be that the university examination was copied from practices in higher education in the Muslim world, but there the individual masters were strictly autonomous in licensing their disciples, leading Makdisi (1981) to the conclusion that the organizational forms of the Western universities and their examinations were real innovations.

The methods of lecturing and studying made it necessary to ‘hear’ the lecture series on a particular book more than once, before one had a reasonably sure knowledge of the text and its commentaries. The regulations of the university stipulated the minimum number of times to hear the lecture series on every book in the examination, making repetition a natural characteristic of education in the universities as well as in the schools, and contributing to the very long duration of studies.

Order of merit in the middle ages was based on one's position in society. The right order was extremely important, even in sitting positions at daily lectures; rich students could buy themselves a place in the ‘noble bench.’ Also the order of merit at examinations, the locatus, was first of all an order of social merit (by birth), and was only in second place determined by criteria such as the length of study (the longer the stay, the higher the place) (Rashdall, 1895 i, p. 459; Schwinges, 1986, p. 355; 1992, p. 234). Many students, however, did not even have the intention to go for the arts examination. The conclusion is: yes, there was an honours list for every examination, but the place on the list had little or nothing to do with academic merit.

In the medieval university merit in the modern sense of academic achievement was important in daily practice, but was not explicitly recognized in the examinations in the way of a ranking order. The medieval university examination has been an important model for examinations ever since, but the competitive examination definitely is a later development.


The disputation: a lost examination format

The disputation is the high point of medieval education. Famous are the disputations between Abelard and William of Champeaux; Abelard describes in his autobiography the flavour of the times and the details of his contests with William (Thorndike, 1944, p. 3). These disputations attracted large numbers of ‘students,’ and marked the beginnings of what would become the university of Paris. There are, of course, many different forms of disputation, and over the centuries there have been important developments in techniques and traditions. A disputation was a major event: all other activities in the university were cancelled so as to give everybody the opportunity to attend. The pièce de résistance of the disputation was a theorem or problem posed by the master who chaired the disputation. The position of the master was to be defended by one of his students (the respondens), and could be opposed by other masters and students. The disputation could last the better part of the day, or even the whole day. The next day the master would give a summary of the arguments pro and contra, and indicate why the opposition failed and what the conclusion or solution (determinatio) of the problem should be. For the respondens participation in the disputation was part of the fulfilment of his examination requirements.

In rare cases the problem posed was a sincere problem eagerly waiting for a solution; here the disputation was a method of finding new secure knowledge. In the middle ages the disputation was the only method to develop new knowledge, and to critically analyze newly translated or discovered theories. In the Muslim world, in the 11th century the disputation was an important instrument in the development of Muslim law, and for that reason an important method in higher education; Makdisi (1981), in good disputational style, posits the primacy of the Muslim disputational form over that of the later European universities. In the development of logic the disputational method was crucial, as described by Kretzmann & Stump (1988, p. 6).

There is an extensive body of literature on the disputation. Many reports have been preserved in the particular literary form of the report as authorized by the master. McDermott (1993) presents in his anthology of the works of Thomas Aquinas a number of quaestiones disputatae. In this anthology there is also a lecture of Thomas; in this lecture one can find the same elements as used in the disputation: arguments and counterarguments, conclusions and refutations. In the field of logic a number of disputations and an introduction to the genre are to be found in Kretzmann, Kenny & Pinborg (1982). Lawn (1993) treats the disputation in medicine and science, shows its essential place in the development of science, and gives some examples. References to the literature can be found in Weijers (1987).

Most of the time, however, the disputations were exercises intended to sharpen the wits of the participants, and as such they were related to the didactic form of questions and answers. Little is known about the role of the disputation in the instructional process, ‘about how students were taught,’ but Perreiah (1984, p. 85) gives details about how ca. 1400 ‘trial disputations’ were delivered: under very strict rules specific to trial disputation, the obligation or the insoluble, and of course under the rules of logic. In the context of trial disputation Perreiah, following Aristotle (Topics 159a 250), speaks explicitly of an instrument to test the knowledge of the participants.

In Jesuit schools the disputation was an instrument to rank students according to merit: the lower ranked student could ‘win’ the rank of his adversary, and vice versa (Compère, 1985, p. 83). Winning or losing was determined by the number of errors made by each contestant. This was also the practice in the Latin school of Sturm (Codina Mir, 1968, p. 173). This kind of ranking by competitive disputations was also known in late Antiquity (Lim, 1995), and in the Muslim world about 1000 (Makdisi, 1981).

The disputation kept a prominent place in university curricula and examinations until the 18th century, when it became increasingly farcical, and it finally gave way to modern forms of examination in the 19th century. In Leiden, early in the 17th century, disputations took place once every two weeks, the very rude discussions regularly resulting in a serious scuffle (Schotel, 1875, p. 332). In Oxbridge, students and faculty no longer took disputations seriously in the 18th century, but only halfway through the 19th century did they disappear altogether (Rothblatt, 1974).

The disputation is the only major type of exercise for assessing one's intellectual agility that has not survived as such into modern times. The disputation was a public event, and because of that the participants must have been highly motivated to do a good job and to give a good public impression. Assessment in this case was self-assessment as well as assessment by one's public. The disputation has been replaced by examinations in question-and-answer style, yet it might be that there is a modern equivalent to the disputation: scientific research and all the preparation for it that goes into modern secondary and higher education. To be able to do scientific research in the late 20th century demands extensive preparation in mathematics, statistics, discipline-specific research methods, and in the peculiar stylistic scholastics that has developed around reporting and publishing research (e.g. in psychology: Madigan, Johnson, & Linton, 1995). The assessment characteristics are like those of the disputation: reporting is public, and standards for good practice are explicit and objective.


Punishment or reward? Ranking and marking systems

A perennial problem in education is to keep the student's attention on his educational tasks. Punishment is traditionally used for this purpose, often taking the form of punishment for non-disciplinary behaviour. The heads of medieval schools and universities were entitled to punish their students, even for crimes committed outside the school. For the medieval student punishment was a daily routine. In the 11th century Egbert, a teacher in Liège, criticized the harsh punishment in the schools of his day, and 14th century Joan Cele was known to be mild in his punishments (Fortgens, 1956, p. 36; Frederiks, 1960, p. 56). The humanists propagated competition and reward instead of punishment to motivate students (Bot, 1955). Scaglione (1986, p. 13) sees a connection between the emergence of these new ideas and practices and the innovations of Joan Cele; he also points out that in the Renaissance there was an extraordinary eagerness to learn, in contrast to the periods before and after (op. cit., p. 93). The influence of the humanists led to a system of prizes for the best students of the class that dominated Western education until well into the 19th century.

In order to be able to reward the best student one should know who he is, and one should have some rules to rank students for this purpose. The prize mechanism led to bookkeeping of points or notae throughout the academic (half) year, points being earned by good behaviour, or lost by making academic mistakes as well as by bad behaviour. The prize system is a driving force behind the development of systems of points and 19th century marking systems. The schools of the Brethren in the late middle ages already had an elaborate system of ranking students according to merit, examinations being used to determine their ranking. Students could challenge the rank given to them, in which case a contest between the challenger and the next better ranked student was held (Codina Mir, 1968, p. 173). Haskins (1923, p. 74) gives the example, from a 15th century student manual, of the daily disputation held by the master with his own pupils, where a prize as well as a symbolic punishment (asinus) was given, to be kept until the following disputation; the same practice existed in 1559 in Calvin's Academy of Geneva. There was then already a practice of keeping a record of earned points or notae. “Classes were divided into decuriae not by age or social rank but by merit and achievement. The decurio supervised all work, and punishment for intellectual sluggishness could take the typical form of nota asini "the ass's mark" or nota sermonis soloecismi, "the mark of bad Latin"” (Scaglione, 1986, p. 47). Centuries earlier, in the Muslim world, the same practice existed of ordering pupils according to merit (Makdisi, 1981, p. 81, 91).

In Jesuit schools competition and ranking by academic merit was the core of the educational program. “The Jesuits, as educators general before modern times, did not formally grade students’ homework or even tests, but by their results they listed the students publicly in order of merit” (Scaglione, 1986, p. 74). From the 17th century lists have survived where every student at the end of the school year was graded according to his achievements and capacities (Compère, 1985, p. 83).

There have always been objections to the prize system. In the middle ages Italian parents objected to the leniency of the system: they preferred punishments. A frequently stated objection was that the many students who would never be able to earn a first or second prize were in fact neglected by this system of rewards. Also there were objections against certain moral problems in the wake of the competition for prizes: fraud, malicious delight, stress, and lying.

In England, during the latter part of the 18th and the first half of the 19th century, the university climate grew competitive, written examinations replacing the orals, and candidates being ranked according to achievement on lists of ‘honours’ candidates that were made public. Low achieving candidates could hide their shame by taking a ‘pass’, in which case they were not ranked and their names were not made public. At Cambridge the participants in the Mathematical Tripos were until 1910 ranked according to achievement, the best achievement being honoured with the title of Senior Wrangler, the least one with a title as well as a man-sized attribute: the Wooden Spoon. Competitive examinations in Oxbridge, in the early 19th century, put the students under great pressure (Rothblatt, 1982). Competitive examinations were also known on the continent, where already in the 17th century at Leuven there was fierce competition between students from its four colleges, called pedagogies (Vanpaemel, 1986, p. 33). Leuven also had its own variant of Cambridge's classes, called lineas; the best candidate was called the primus and he was highly honoured. The great pressure on students that Rothblatt mentions manifested itself at Leuven already before 1675. The resemblance between the examinations in Leuven and Cambridge does seem to have escaped the attention of historians.

Exactly why and how ranking systems were replaced with marking systems in the 19th century is not known, but surely the 19th century belief in the power of measurement (Kula, 1986) must have been involved. Ranking of students was in the first half of the 19th century still the dominant practice in secondary education. According to Compère (1985, p. 83), before 1850 there was no marking system in use in France. In early 20th century Germany, class ranking was possibly still in general use: Stern (1920) compared the scores on his new intelligence test with the rank in class, not the marks obtained. In the Netherlands the gymnasium of Groningen was probably the last school to substitute a marking system for the system of notae and ranking lists, doing so only in 1901 (Van Herwerden, 1947, p. 41). For the United States the history of grading systems in higher education is described by Smallwood (1935). Notwithstanding the replacement of the supposedly old-fashioned ranking system with the marking system, high marks were still as scarce a good as the first or second place in the class order of merit, because they were artificially made scarce (Deutsch, 1979, p. 393). In England the first case of marking examination papers is found in the Mathematical Tripos of 1836: “Earlier examiners and moderators tended to rely on impression” (Rothblatt, 1982, p. 14).

In the ranking system rank was determined by the summed scores (= notae) of all the students in the form. For that purpose notebooks were kept; in Groningen, for example, every student had a notebook wherein all notae were jotted down, not only his own, but also those of all other students (Van Herwerden, 1947, p. 41; Rudolph, 1977, p. 147, for a parallel at Harvard). The notebooks in Western education resemble the Books of Merit and Demerit in China, in the 16th and 17th centuries (Brokaw, 1991), but there is probably no link between the two systems. There must be some kind of relation, however, between ranking systems and their replacements, the marking systems that are still used all over the world; knowing that relation might shed some light on the reasons for adopting marking systems.

A short description of the emergence of the marking system in England is given by Rothblatt (1993, p. 44); competitive examinations in Oxbridge demanded objective assessment, and credible objectivity demanded the curriculum to be narrowed so as to be able to assess by using marks. This is an important clue, that marking served purposes of ranking, especially to legitimize the judgments being made of the examination papers, and that curricular content was adjusted to make this kind of assessment possible. In France the marking system seems to have evolved from the ranking system: Chervel (1993, p. 136 ff.) shows how juries for the French concours d'agrégation gradually changed a complex ranking procedure into a marking system. Instead of simply ranking the candidates from the worst (number one) to the best achiever (equal to the number of candidates), candidates came to be ranked on a fixed range from one (worst) to ten (best), allowing ties, or breaking ties by using halves. The change was made complete by not using the extreme numbers when the impression was that candidates were not good or bad enough to ‘deserve’ them. Marking systems differ from country to country, while the basic idea underlying them is the same everywhere in the Western world: the system of ranking stripped of its prizes, and pseudo-objectified by evaluating achievement directly on a marking scale. With hindsight, the problem in the new marking systems is the lack of rules or standards that could make the translation from the number of errors to the assigned grade an objective one.

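The missing rule for translating errors into grades can be illustrated with a minimal sketch. Both conversion rules below are hypothetical and are not taken from Chervel or any other source cited here; they only show that two equally plausible rules preserve the rank order of four papers while assigning them quite different marks on a one-to-ten scale.

```python
def fixed_penalty_mark(errors, penalty=0.5, top=10.0, bottom=1.0):
    """Hypothetical rule A: subtract a fixed penalty per error from the top mark."""
    return max(bottom, top - penalty * errors)


def relative_mark(errors, group_errors, top=10.0, bottom=1.0):
    """Hypothetical rule B: stretch marks between the best and worst paper in the group."""
    best, worst = min(group_errors), max(group_errors)
    if best == worst:
        return top  # all papers equally good: no basis for differentiation
    return top - (top - bottom) * (errors - best) / (worst - best)


# Invented error counts for four examination papers.
error_counts = [2, 5, 8, 14]

marks_a = [fixed_penalty_mark(e) for e in error_counts]
marks_b = [relative_mark(e, error_counts) for e in error_counts]

print("rule A:", marks_a)
print("rule B:", marks_b)
```

Under rule A the weakest paper receives a 3.0, under rule B a 1.0, although the work itself is the same; the rank order is identical in both cases.
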

Competition and the state

Modern examinations were formed in the critical period of the late 18th and early 19th century, this formation having much to do with the rise of modern states in Europe. In fact it was state influence that was the crucial factor in most countries, England being a special case because of the autonomous nascence of Oxbridge competitive examinations, and the U.S.A. not yet participating in this process of state formation.

University enrollment in the 17th and 18th century was low, and in many countries examinations no longer existed, or what was called examination was farcical. “All through the 18th century, the examination for the B.A. had been a purely formal ritual of answering standard questions known in advance, and reading a ‘wall lecture,’ so called because the examiners would generally leave during the reading of the lecture. It was purely a formal requirement that the lecture be read and the examiners were not required to judge its quality” (Engel, 1974, p. 307). “For most of the 18th century undergraduates and collegiate fellows were bored” (Rothblatt, 1974, p. 247). In continental Europe the general trend in the 17th and especially the 18th century was that the state tried to get a hold on the universities and their examinations in order to control the numbers and qualities of its civil servants (Frijhoff, 1992). Where earlier one's family, wealth and relations were decisive for obtaining attractive government positions, now merit was becoming the prime criterion. This did not mean that other factors now became unimportant, or that elite positions were threatened by newcomers (Fischer & Lundgreen, 1975). The importance of merit also did not mean that positions were now in fact open to all the talented: the costs involved in reaching competitive positions in education were so high that only the established elites and wealthy merchants could bear them, as was also the case in the middle ages (Schwinges, 1986, p. 5). Only the 20th century would see the combination of merit and more equal opportunity.


England

The development of ‘modern’ examinations in England begins as early as the first half of the 18th century with the institution of the Senate House examination at Cambridge, later to become the Mathematical Tripos (Gascoigne, 1984). The why and how of this development are unknown, but Rothblatt (1974) presents many relevant facts and interesting speculations. Roach (1971, p. 12) affirms the decisive role the English university examinations played as a model for the civil service examinations that were established in the middle of the 19th century. The pervasive influence of the university examinations is described by Rothblatt (1982, p. 15): “The Oxbridge model was followed in the schools, in military academies, in the system of local examinations and in the various branches of civil services, excepting the Department of Education and the Foreign Office. Different career phases became linked together by the same examinations (...).”


France

Present-day France knows the educational contest, the concours, for entrance to prestigious institutions and colleges; this tradition has its origin in a bequest of Louis Legrand, who started a yearly contest between 10 Parisian colleges in 1747 (Palmer, 1985, p. 24). In the later 18th century more examinations began to be used, and in a more stringent manner, for recruitment to technical institutions for the army (Ecole du Génie) and the government (Ecole des Ponts et Chaussées), and after the revolution for the Ecole Polytechnique, an institution that was widely imitated by other European countries. The whole point of the concours is that admission to a grande école, for example, will practically guarantee a prestigious job. In France it was the government that made examinations, for the first time in French history, decisive for many a state career; for this purpose it instituted examinations that did not exist before in this form.


Prussia

Prussian rulers in the 18th century built the most efficient bureaucracy of Europe. They instituted the earliest civil service examinations with the intention of breaking the monopoly of the aristocracy in high government positions (Prahl, 1974, p. 300). In the 18th century a course preparing for government jobs was instituted next to the traditional faculties of theology, law, and medicine. To regulate numbers, restrictions were introduced, also for the other faculties, taking the form of a final examination of the Gymnasium: the Abitur. In the 19th century students in government tracks and of limited means, the so-called Brotstudenten, were cramming for their state examinations; this group was not sold on the Humboldtian ideal of the university. Growing numbers of students in the 19th century led to the bureaucratization of the state examinations themselves as well, strengthening the natural tendency of Brotstudenten to cram for their exams (McClelland, 1980). In these strong developments taking place in the 18th and 19th century the form and function of assessment in Germany were definitively set.

The characteristic development in the 18th and 19th century is that assessment became a serious matter. No longer was it only a question of honour to win the prize; now one's future career depended on it. No wonder that competitive examinations were going to dominate the educational scene: assessment now served many other lords and interests besides those of transmission of cultural heritage. Assessment no longer served any didactic purposes; instead it dictated them in the form of the necessity of cramming for narrowly defined examinations. Rothblatt (1982) studied the stress that Oxbridge students experienced in their years of study early in the 19th century. From now on, for most students the only thing that counted was what would ultimately be tested.
Because so much now depends on the outcome of examinations, the pressure is in the direction of kinds of questions that do not divide assessors, and on procedures of counting errors or assigning marks that give the impression of exactness. Assessors now stand on the side of state interests or of the professional association, no longer on the side of the student like the medieval master did. Merit assessment has its price: an objectifying distance between assessors and assessed. Yet, the same meritocratic procedures, once in place, made it possible in the 20th century to really offer educational and career possibilities to the talented from all classes in modern society, even though in the eyes of some this may have been a mixed blessing (Ringer, 1979, voicing this feeling).


Chinese mandarin examinations a model for the West

Imperial Chinese examinations are the first known written examinations in history; they gave entry to civil service and were very selective. They were held once in every few years, in halls dedicated to these examinations. Examinations were thoroughly meritocratic, reflecting the Confucian philosophy on the place of merit in China's hierarchic society. Examinations had different forms and functions in different periods, as documented for the examinations of the Ming and Tsing dynasties, from the 14th century until 1905, by Ho (1962) and Miyazaki (1976). Miyazaki's title, ‘China's examination hell,’ adequately depicts the character of these examinations. The main characteristics of these examinations are the written form, the elaborate measures taken to ensure objective assessment, their literary content based on the Confucian classics, the possibility of open participation to all, chances of success more often than not only one in a hundred, unlimited opportunities to participate again even at an advanced age, and the frequency of once in every three years. Examinations were the most important opportunity to become a civil servant, a position held in very high esteem and well paid. With the exception of the reign of the Khans, who abolished the examinations, examinations played a crucial role in the stability of the empire, curtailing the power of the aristocracy and the military, and legitimizing the favoured position of civil servants.

Changes in the examination ‘culture’ in Europe between the early 18th and the late 19th century were manifold, and in the direction of the chief characteristics of Imperial Chinese examinations and bureaucracy: from oral to written examinations, from inconsequential examinations to explicit selection for civil service, from formal ceremonies to competitive examinations, from small numbers to numbers of participants many times higher than the numbers of available places. The resemblance of the European developments during the Enlightenment to the situation in 18th-century China, which was admired by many intellectuals in Europe, suggests some influence of the Chinese model. During the 18th and 19th century many factors influenced the development towards competitive examinations in Europe, among them the achievement of free trade, a principle that also could be of use in government and education. Among the numerous factors mentioned in the literature, the availability of the model of Chinese civil service examinations deserves special mention. It was widely known in Europe, and examinations modeled after this Chinese examination format were propagated by, for example, Adam Smith in his Wealth of Nations; see Têng (1943) and Guy (1963) for details on the way the Chinese model influenced European thinking on examinations and their societal role. In their turn, European examinations influenced developments in Japan; its Meiji government instituted meritocratic civil service examinations after the Chinese model, with strong Prussian influence (Spaulding, 1967; Rohlen, 1983, p. 61).

That the Chinese model might have served as a kind of magnet for developments in Western Europe should strengthen our reflective mood regarding the dominant presence of examinations in our daily life. The Chinese civil service examinations were just what the name suggests: a means for selection of civil service personnel, not an educational system. Imperial China never developed an adequate educational system, although in the Sung period a serious effort was made. The suggestion from the Chinese experience is that a strong examination system threatens the quality and even the existence of the educational system. Selection is not a productive process, for it does not of itself produce qualifications; a society that takes the productivity of its educational system seriously should keep education and assessment and selection in proper balance.


Discussion

This search for possible roots of assessment, superficial as it of necessity must be, nevertheless shows some significant and maybe quite unsuspected facts. The first observation is that, indeed, before the beginning of the 20th century assessment had already developed into the forms and procedures that still characterize it today. This underscores that our ‘assessment culture’ is, for better or worse, the legacy of societies that are long since gone. Another conclusion from this historical exercise is that the history of ‘educational measurement,’ going back to Galton and Binet, is surely not the history of assessment. Assessment itself was seen to be a complex concept that could be analyzed in terms of its content, its context of the graded curriculum, its descent from medieval university examinations, its instrumental quality to motivate students, its uses in (societal) selection as an instrument of the state, and as strengthened in its meritocratic character by the example of China's mandarin examinations. Still, some aspects had to be left out, such as how medieval masters and students used their time and what their attitude towards work was (Van den Hoven, 1996), or the period immediately before the rise of the universities (Jaeger, 1994).

This article might give rise to more questions than it answers, in which case it would fulfil its intention to stimulate reflection on assessment. The historical facts in this article concern educational systems primarily serving the upper classes of society, while education in the 20th century is mass education, even extending to mass higher education. Why then would knowledge of the roots of assessment be relevant for understanding current assessment practice? The fascinating observation is that assessment procedures handed down by tradition were in this century rather uncritically adopted in mass education, possibly leading to major inefficiencies in education and for too many students a lack of quality of school life.



Note. The research for this paper was partly subsidized by the Netherlands Foundation for Educational Research (SVO) in The Hague, grant number 94707.


References

Ariès, Ph. (1960). L'enfant et la vie familiale sous l'ancien régime. Paris: Plon.

Berkey, J. (1992). The transmission of knowledge in medieval Cairo. A social history of islamic education. Princeton: Princeton University Press.

Bolgar, R. R. (1954). The classical heritage & its beneficiaries. Cambridge, at the University Press.

Bot, P. N. M. (1955). Humanisme en onderwijs in Nederland. Utrecht: Het Spectrum.

Brokaw, C. J. (1991). The ledgers of merit and demerit. Social change and moral order in late imperial China. Princeton: Princeton University Press.

Bullough, V. L. (1978). Achievement, professionalization, and the university. In J. IJsewijn, & J. Paquet (Eds.). The universities in the late middle ages (p. 497-510). Leuven, at the University Press.

Chervel, A. (1993). Histoire de l'agrégation. Contribution à l'histoire de la culture scolaire. Paris: INRP Editions Kime.

Codina Mir, G. (1968). Aux sources de la pédagogie des Jésuites; le ‘Modus Parisiensis.’ Roma: Institutum Historicum S.I.

Compère, M. M. (1985). Du collège au lycée (1500-1850). Généalogie de l'enseignement secondaire français. Paris: Gallimard/Julliard.

Deutsch, M. (1979). Education and distributive justice: some reflections on grading systems. American Psychologist, 34, 379-401.

Engel, A. (1974). Emerging concepts of the academic profession at Oxford 1800-1854. In Stone, L. (Ed.). The university in society. Vol I Oxford and Cambridge from the 14th to the early 19th century (p. 305-351). Princeton: Princeton University Press.

Fischer, W., & Lundgreen, P. (1975). The recruitment of administrative personnel. In Tilly, C. (Ed.). The formation of national states in western Europe (p. 456-561). Princeton: Princeton University Press.

Foden, F. (1989). The examiner. James Booth and the origins of common examinations. Leeds: School of Continuing Education.

Fortgens, H. W. (1956). Meesters, scholieren en grammatica; uit het middeleeuwse schoolleven. Zwolle: Tjeenk Willink.

Frederiks, J. (1960). Ontstaan en ontwikkeling van het Zwolse schoolwezen tot omstreeks 1700. Een historische studie. Zwolle: Tijl.

Frijhoff, W. (1992). Universities: 1500-1900. In Clark, B. R., & Neave, G. R. (Eds.). The encyclopedia of higher education (II, p. 1251-1259). Oxford: Pergamon Press.

Gascoigne, J. (1984). Mathematics and meritocracy: the emergence of the Cambridge Mathematical Tripos. Social Studies of Science, 14, 547-584.

Guy, B. (1963). The Chinese examination system and France, 1569-1847. In Besterman, T. Studies on Voltaire and the eighteenth century, vol. 25, 741-778. Geneva: Institut et Musée Voltaire.

Hanson, F. A. (1993). Testing testing. Social consequences of the examined life. Berkeley: University of California Press.

Haskins, Ch. H. (1923/1957). The rise of the universities. London: Cornell University Press.

Ho, P. T. (1962). The ladder of success in imperial China. Aspects of social mobility, 1368-1911. New York: Columbia University Press.

Hu, C. T. (1984). The historical background: examinations and control in pre-modern China. Comparative Education, 20, 7-26.

Ingenkamp, K. (1972). Zur Problematik der Jahrgangsklasse. Weinheim: Beltz.

Jaeger, C. S. (1994). The envy of angels. Cathedral schools and social ideals in Medieval Europe, 950-1200. Philadelphia: University of Pennsylvania Press.

Kretzmann, N., & Stump, E. (Eds.) (1988). The Cambridge translations of medieval philosophical texts. Volume one: logic and the philosophy of language. Cambridge: Cambridge University Press.

Kretzmann, N., Kenny, A., & Pinborg, J. (Eds.) (1982). The Cambridge History of Later Medieval Philosophy. From the rediscovery of Aristotle to the disintegration of scholasticism 1100-1600. Cambridge: Cambridge University Press. [later edition 1988]

Kula, W. (1986). Measures and men. Princeton: Princeton University Press.

Lawn, B. (1993). The rise & decline of the scholastic ‘quaestio disputata’ with special emphasis on its use in the teaching of medicine and science. Leiden: Brill.

Lewry, O. (1982). Thirteenth-century examination compendia from the faculty of arts. In Les genres littéraires dans les sources théologiques et philosophiques médiévales (p. 101-116). Louvain-la-Neuve.

Lim, R. (1995). Public disputation, power, and social order in late antiquity. Berkeley: University of California Press.

Madigan, R., Johnson, S., & Linton, P. (1995). The language of psychology: APA style as epistemology. American Psychologist, 50, 428-436.

Makdisi, G. (1981). The rise of colleges: institutions of learning in Islam and the west. Edinburgh: Edinburgh University Press. [see also Abdul Haq Compier, 2011, How Europe came to forget about its Arabic heritage]

McClelland, Ch. E. (1980). State, society, and university in Germany 1700-1914. Cambridge: Cambridge University Press.

McDermott, T. (Ed.) (1993). Thomas Aquinas, Selected philosophical writings. Oxford: Oxford University Press.

Miyazaki, I. (1976). China's examination hell. New York: Weatherhill.

Moraw, P. (1992). Careers of graduates. In H. de Ridder-Symoens (Ed.). A history of the university of Europe. Volume I, Universities in the middle ages (p. 244-279). Cambridge: Cambridge University Press.

Palmer, R. R. (1985). The improvement of humanity. Education and the French revolution. Princeton: Princeton University Press.

Paulsen, F. (1921/1960). Geschichte des gelehrten Unterrichts auf den deutschen Schulen und Universitäten vom Ausgang des Mittelalters bis zur Gegenwart. Vol. II. Berlin.

Perreiah, A. R. (1984). Logic examinations in Padua circa 1400. History of Education, 13, 85-103.

Post, R. R. (1954). Scholen en onderwijs in Nederland gedurende de middeleeuwen. Utrecht: Het Spectrum.

Prahl, H. W. (1974). Abschlussprüfungen und Graden. Sozialhistorische und ideologiekritische Untersuchungen zur akademischen Initiationskultur. Dissertation Universität Kiel.

Rashdall, H. (1895). The universities of Europe in the middle ages. Edited by F. M. Powicke and A. B. Emden (1936). Oxford: at the Clarendon Press.

Riché, P. (1989). Ecoles et enseignement dans le Haut Moyen Age. Fin du Ve siècle - milieu du XIe siècle. Paris: Picard.

Ringer, F. (1979). Education and society in modern Europe. Bloomington: Indiana University Press.

Roach, J. (1971). Public examinations in England 1850-1900. Cambridge: Cambridge University Press.

Rohlen, T. P. (1983). Japan's high schools. Berkeley: University of California Press.

Rothblatt, S. (1974). The student sub-culture and the examination system in early 19th century Oxbridge. In Stone, L. (Ed.). The university in society. Vol I Oxford and Cambridge from the 14th to the early 19th century (I, p. 247-303). Princeton: Princeton University Press.

Rothblatt, S. (1982). Failure in early nineteenth century Oxford and Cambridge. History of Education, 11, 1-21.

Rothblatt, S. (1993). The limbs of Osiris: liberal education in the English-speaking world. In Rothblatt, S., & Wittrock, B. (Eds.). The European and American university since 1800. Historical and sociological essays (p. 19-73). Cambridge: Cambridge University Press.

Rudolph, F. (1977). Curriculum. A history of the American undergraduate course of study since 1636. San Francisco: Jossey Bass.

Scaglione, A. D. (1986). The liberal arts and the Jesuit college system. Amsterdam: Benjamins.

Schotel, G. D. J. (1875). De academie te Leiden in de 16e, 17e en 18e eeuw. Haarlem: Kruseman & Tjeenk Willink.

Schwinges, R. C. (1986). Deutsche Universitätsbesucher im 14. und 15. Jahrhundert: Studien zur Sozialgeschichte des alten Reiches. Stuttgart: Steiner.

Schwinges, R. C. (1992). Student education, student life. In De Ridder-Symoens, H. (Ed.). A history of the university of Europe. Volume I, Universities in the middle ages (p. 195-243). Cambridge: Cambridge University Press.

Smallwood, M. L. (1935). An historical study of examinations and grading systems in early American universities. Cambridge: Harvard University Press.

Spaulding, R. M. (1967). Imperial Japan's higher civil service examinations. Princeton: Princeton University Press.

Stern, W. (1920). Die Intelligenz der Kinder und Jugendlichen und die Methoden ihrer Untersuchung. Leipzig: Barth.

Têng, S. (1943). Chinese influence on the western examination system. Harvard Journal of Asiatic Studies, 7, 267-312.

Thorndike, L. (1944). University records and life in the middle ages. New York: Columbia University Press.

Van den Hoven, B. (1996). Work in ancient and medieval thought. Amsterdam: Gieben.

Van Herwerden, P. J. (1947). Gedenkboek van het Stedelijk Gymnasium te Groningen. Groningen: Wolters.

Vanpaemel, G. (1986). Echo's van een wetenschappelijke revolutie. De mechanistische natuurwetenschap aan de Leuvense Artesfaculteit (1650-1797). Verhandelingen van de Koninklijke Academie voor Wetenschappen, Letteren en Schone Kunsten van België, Klasse der Wetenschappen, Jaargang 48, Nr. 173. Brussel: Paleis der Academiën.

Verger, J. (1992). Patterns. In De Ridder-Symoens, H. (Ed.) A history of the university of Europe. Volume I, Universities in the middle ages (p. 35-74). Cambridge: Cambridge University Press.

Webber, C. (1989). The mandarin mentality: civil service and university admissions testing in Europe and Asia. In Gifford, R. (Ed.). Test policy and the politics of opportunity allocation: the workplace and the law (p. 33-60). Dordrecht: Kluwer.

Weijers, O. (1987). Terminologie des universités au XIIIe siècle. Roma: Edizione dell’ Ateneo.

Weijers, O. (1995). Les règles d'examen dans les universités médiévales. In Hoenen, M. J. F. M., Schneider, J. H. J., & Wieland, G. (Eds.). Philosophy & learning. Universities in the middle ages. (p. 201-223). Leiden: Brill.

Yates, F. A. (1966). The art of memory. London: Routledge & Kegan Paul.


Correspondence


Geert Vanpaemel, June 25 1996 (from an email, in Dutch):


I am not aware of any special study on the system of examinations in Leuven. That its competitive system is so unique I did not know. As I have not researched the period before 1650, I do not know where it originates from. It existed already in the beginning of the sixteenth century and has probably been copied from Paris.


There is an unpublished paper (licentieverhandeling) on the organisation of the Artes department in the eighteenth century by Cleenewerck de Clayencour; it should be available (typescript) at Universiteitsarchief, Mgr Ladeuzeplein 21, B-3000 Leuven. The following publications might be relevant:
E. Reusens (1867). Statuts primitifs de la Faculté des Arts. Bulletin de la Commission Royale d'Histoire, 3, 9 (1867), 151-183.
J. Paquet (1970). Statuts de la faculté des Arts de Louvain 1567-1568?, Bulletin de la Commission Royale d'Histoire, 136, 179-271.
P. F. X. de Ram (1861). Codex veterum statutorum Academiae Lovaniensis. In J. Molanus: Historiae Lovaniensis Libri XIV, Bruxelles, 2, 944-979 and 1089-1178.


After being invited to write the article for the StEE I regrettably did not have the time to follow up the leads Geert Vanpaemel provided me with. In May 1997 I obtained de Vocht's (1951) four-volume history of the Collegium Trilingue Lovaniense, which could shed some light on the early sixteenth century situation Geert Vanpaemel hinted at. I have not yet found the opportunity to browse the more than 2000 pages for information on the system of examination used. A relative of a forefather of my grandchildren was Maarten van Dorp, Martinus Dorpius, a professor at Leuven in the college ‘De Lelie’, who died in 1525. He probably got this position on the basis of excellent examination results. ‘De Universiteit te Leuven, 1425-1975’ mentions the following on the ranking of students within their college:
“Aan het eind van deze reeks examens werden de studenten gerangschikt—locatio heette dat—volgens de behaalde uitslagen. De rangschikking gebeurde volgens vier categorieën of ordines; eerste rang de rigorosi. tweede de transibilies, derde de gratiosi capaces tamen gratiae, afgewezen werden de gratiosi seu refutabiles. Nummer één in de rangschikking werd tot primus uitgeroepen en in triomf door de stad gevoerd.” (p. 90).

[After the close of the series of exams the students were ranked (locatio) according to the results obtained. The ranking followed the four categories or ordines: first rank the rigorosi, second the transibilies, third the gratiosi capaces tamen gratiae, while the gratiosi seu refutabiles were failed. The number one in the rank order was acclaimed as the primus and was carried through the city in triumph.]

The last sentence suggests that the ‘winner’ was only one person, not the bunch of four winners from the four colleges. This secondary source is not clear on this point. The relative numbers of students passed and students failed are not known from the sources. So much is known, however, that instruction at the Artes department was directed at high standards that only the best students were able to meet. Even with this small amount of information on the examinations in the sixteenth century it is evident that there must have been fierce competition between students, with many students dropping out already at the early stages of their study.


Bots, H., I. Matthey en M. Meyer (1979). Noordbrabantse studenten 1550-1750. Tilburg: Stichting Zuidelijk Historisch Contact.
An early (1555!) Brabant student winning the Leuven concours (see de Vocht, mentioned above) is Rogerus (Rutger) Hessels, alias Alardi (Alarts) (Bots, p. 49, 363 #2148):

Macharen. Matriculated at Leuven in January 1555 (pedagogy Het Varken, pauper). Took the degree of A.L. on 16-3-1557 as first (primus) of 173 candidates. Mentioned in 1558 as a student of theology with a benefice at Macharen. Appears on a list of scholarship holders of the Standonck college from the years 1559-1562. The compiler of that list characterized Hessels as ‘a talented man, friendliness itself, but too fond of company.’ Was parish priest at Macharen in 1571. Became dean of the district of Oss in that year. From c. 1578 until his death parish priest at Grave and dean of the district of Cuyk. Died c. 1596. [for sources see Bots e.a. p. 363]
“In the 15th and 16th centuries the complete Artes course at Leuven took 2 1/2 years, later 2 years. The two-year course consisted of 9 months of logic, 8 months of physics, and 4 months of metaphysics and ethics, followed by 3 months for reviewing the material covered. Four months after the start of his studies the arts student had to give his first proof of competence, the actus determinantiae. The disputations for the title of baccalaureus artium took place at the beginning of the second year. The Artes course was concluded with the licentiate examination in the form of a concours in which all candidates took part. It was preceded by an examination in each of the four pedagogies, aimed at preselecting the best students, the lineales, who would compete with each other in a second round. The three students from each pedagogy who scored highest in this preliminary examination, called calamus, qualified for the first line. Those who finished 4th to 6th in the calamus ended up in the second line. Numbers 7, 8 and 9 formed the third line. Students who did not manage to qualify for one of the lines were designated postlineales. The results in the final concours determined the position within the lines as well as the ranking of the postlineales. The best student of the first line was proclaimed primus universitatis. The primus was the object of exuberant festivities, both in Leuven and in his place of origin.”

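The two-stage procedure quoted from Bots e.a. can be sketched in a few lines. The candidate names, the pedagogy labels and the numeric concours scores are hypothetical; in particular, the source does not say how results in the final concours were expressed, so a simple 'higher is better' score is assumed here.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    pedagogy: str
    calamus_position: int   # position within his own pedagogy's preliminary exam
    concours_score: float   # assumed numeric result in the final concours


def line_of(calamus_position: int) -> Optional[int]:
    """Positions 1-3 form the first line, 4-6 the second, 7-9 the third.

    Returns None for the postlineales (position 10 and lower).
    """
    if calamus_position < 1:
        raise ValueError("calamus positions start at 1")
    line = (calamus_position - 1) // 3 + 1
    return line if line <= 3 else None


def final_order(candidates):
    """Order by line first, then by concours result within each line.

    Postlineales are placed after the third line and ranked among themselves.
    """
    def sort_key(c):
        line = line_of(c.calamus_position)
        return (line if line is not None else 4, -c.concours_score)
    return sorted(candidates, key=sort_key)


# Invented candidates from two of the four pedagogies.
candidates = [
    Candidate("Petrus", "pedagogy A", 1, 70.0),
    Candidate("Johannes", "pedagogy B", 2, 85.0),
    Candidate("Lambertus", "pedagogy A", 4, 95.0),
    Candidate("Goswinus", "pedagogy B", 10, 99.0),
]

ranking = final_order(candidates)
primus = ranking[0]  # best student of the first line
print([c.name for c in ranking], "primus:", primus.name)
```

On this reading, a strong performance in the final concours could not lift a candidate above the line he was assigned to in the calamus: the invented Lambertus and Goswinus outscore the primus but rank below the whole first line.
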

Among the Brabant students are the following primi:
2148 Rogerus Hessels (for details see the entry above),
2264 Lambertus Hoex (Houckx) primus 17-11-1726,
3348 Petrus de Louw, Den Bosch, primus Nov 1588,
3736 Laurentius Nagelmaeker alias Bacx alias Van Westerhoven primus 18-2-1563,
5546 Lambertus Vincent, Grave, primus 12-11-1648,
5575 Godefridus van Vlierden, Den Bosch, primus 18-2-1574,
5747 Johannes van den Warck (Waerck), Breda, primus 27-11-1590,
5915 Goswinus Witte, Hilvarenbeek, primus 14-2-1576,
and in Douai: 1485 Simon Fierlands *ca 1602 Den Bosch

Bots p. 46-47




Previous work


Wilbrink, B. (1995). What its historical roots tell us about assessment in higher education today. 6th European Conference for Research on Learning and Instruction, Nijmegen. Paper: author.

Wilbrink, B. (1995). Leren waarderen: de geschiedenis. Amsterdam: SCO-Kohnstamm Instituut (draft). (SVO project 94707)

Wilbrink, B. (1995). Leren waarderen: de geschiedenis. Version with extensive notes. (SVO project 94707)




Related issue: if grading really is a form of ranking, then what is a GPA?


This article has articulated how grading really is a form of ranking, as it developed on the basis of straightforward ranking as practiced in schools throughout Western Europe in modern times. Once in place, a grading system invites one to regard grades as quantities that may be added, averaged, subtracted, etcetera. A GPA (grade point average) seems a natural way to combine grades, no matter how individual grades might be obtained: at what time of year, on what kind of exercise, in which discipline. The next step then is to use GPA in decisions on admissions, tracking, selection. Scores on tests, such as the American Scholastic Aptitude Test (SAT), may be standardized, yet their fundamental character is that of ranks also. What has changed since the eighteenth or nineteenth century, then, is the complexity of the rankings involved. In former centuries the ranking was based on, for example, the number of errors the students had made in their work. In more recent times, the GPA is a kind of ranking of rankings, possibly quite another kind of procedure than the old way of ranking pupils.

As far as grading may be regarded as (a form of) ranking, it is possible to analyze its inner workings precisely. This kind of analysis uses the famous Impossibility Theorem by Arrow (1963), a result in social choice theory concerning the impossibility of consistently combining several rank orders of a set of choice alternatives into one overall rank order. The bite is this: inconsistent ranking of individuals definitely is unfair. The analysis of assessment rankings in terms of Arrow's result is presented in Vassiloglou and French (1982). What is unfair about ranking on the basis of ranks is, for example, that the resulting rank order is dependent on who else is being ranked together with you. The authors present an example from the literature, a ranking of five candidates on the basis of five subrankings, and show that deleting one candidate may change the ranking of the other four relative to each other. Now imagine these candidates applying for admission to Harvard University... The SAT results etcetera might validly indicate merit, yet in the details of the selective process it will be the case that admissions also depend on trivial circumstances.

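The dependence of a combined ranking on who else is being ranked can be made concrete with a small sketch. The data below are invented and are not the example discussed by Vassiloglou and French (1982); the combination rule, summing each candidate's position over all subrankings, is likewise an assumption about how a 'ranking of rankings' might be computed.

```python
def rank_sum_order(subrankings, candidates):
    """Combine subrankings by summing positions (1 = best); lowest total ranks first.

    Candidates not listed in `candidates` are treated as deleted: the others
    close ranks within each subranking while keeping their relative order.
    """
    totals = {c: 0 for c in candidates}
    for ranking in subrankings:
        remaining = [c for c in ranking if c in totals]
        for position, c in enumerate(remaining, start=1):
            totals[c] += position
    # Ties (none occur in this example) are broken alphabetically.
    return sorted(candidates, key=lambda c: (totals[c], c))


# Five invented subrankings (best first) of three candidates.
subrankings = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["B", "C", "A"],
]

with_c = rank_sum_order(subrankings, ["A", "B", "C"])
without_c = rank_sum_order(subrankings, ["A", "B"])

print("all three candidates:", with_c)
print("after deleting C:    ", without_c)
```

With all three candidates present B comes out ahead of A (rank sums 8 and 9); once C is deleted A comes out ahead of B (rank sums 7 and 8), although no examiner changed his opinion about A and B.
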
There are at least two ways to escape the unfairness of ranking of rankings: the first is to get rid of the problem itself by replacing simple ranking of performances with assessment of the strengths of performances (French and Vassiloglou, 1986); the second is to inform candidates about the workings of the ranking method that will be used in the coming examination (De Groot, 1970; Van Naerssen, 1970).




Amy N. Langville & Carl D. Meyer (2012). Who's #1? The Science of Rating and Ranking. Princeton University Press. site




Simon French and Marilena Vassiloglou (1986). Strength of performance and examination assessment. British Journal of Mathematical and Statistical Psychology, 39, 1-14. abstract

A. D. de Groot (1970). Some badly needed non-statistical concepts in applied psychometrics. Nederlands Tijdschrift voor de Psychologie en haar Grensgebieden, 26, 360-376. Didakometrisch en Psychometrisch Onderzoek, June 1970. [Article in English. Partly available in html]

R. F. van Naerssen (1970). Over optimaal studeren en tentamens combineren [On optimal studying and combining examinations]. Openbare les [public lecture]. html [Tentamen model. English abstract available]

Marilena Vassiloglou and Simon French (1982). Arrow's theorem and examination assessment. British Journal of Mathematical and Statistical Psychology, 35, 183-192. abstract




Other, recent, or recently found, publications on the subject


Rita Copeland & Ineke Sluiter (2009). Medieval grammar & rhetoric. Language arts and literary theory, AD 300-1475. Oxford University Press. Leiden - contents

Olga Weijers (1995). La ‘disputatio’ à la Faculté des arts de Paris (1200-1350 environ). Esquisse d’une typologie. Turnhout: Brepols.

Olga Weijers (2002). La ‘disputatio’ dans les Facultés des arts au moyen âge. Brepols. [not yet consulted]

Jan Spoelder (2000). Prijsboeken op de Latijnse school. Een studie naar het verschijnsel prijsuitreiking en prijsboek op de Latijnse scholen in de Noordelijke Nederlanden (ca. 1585-1876), met een repertorium van wapenstempels. With a summary in English. Doctoral dissertation, Katholieke Universiteit Nijmegen. Amsterdam: APA-Holland University Press. summary, exhibition [seen but not read; Burgersdijk & Niermans auction, May 2009]

Heikki Lempa (2006). Patriarchalism and Meritocracy: Evaluating Students in Late Eighteenth-Century Schnepfenthal. Paedagogica Historica, 42, 727-749.

Paul Black (2001). Dreams, Strategies and Systems: portraits of assessment past, present and future. Assessment in Education: Principles, Policy & Practice, 8, 65-85.

Thomas Sullivan (2000). Merit ranking and career patterns: The Parisian faculty of theology in the late Middle Ages. In William J. Courtenay and Jürgen Miethke: Universities & schooling in Medieval society (pp. 127-163) Leiden: Brill.


June Barrow-Green (1999). ‘A corrective to the spirit of too exclusively pure mathematics’: Robert Smith (1689-1768) and his prizes at Cambridge University. Annals of Science, 56, 271-316.

R. J. Mislevy (1993). Foundations of a new test theory. In N. Frederiksen, R. J. Mislevy and I. I. Bejar (Eds.), Test theory for a new generation of tests. Hillsdale, NJ: Erlbaum.

Hoi K. Suen and Lan Yu (2006). Chronic consequences of high-stakes testing? Lessons from the Chinese Civil Service Exam. Comparative Education Review, 50, 46-65. http://mentalpolyphonics.com/wp-content/uploads/2007/02/suen06.pdf [dead link? 2-2009]

David R. Hubin (1988). The history of the SAT. Submitted as an American History Ph.D. dissertation in 1988 to the University of Oregon.

Engelhard, G., Jr. (Ed.) (1997). Special Issue: History of Modern Psychometrics. Educational measurement: Issues and Practice. Volume 16 # 4 winter 1997.

Judges, A. N. (1969). The evolution of examinations. In Lauwerys, J. A., & Scanlon, D. G. (Eds.) (1969). Examinations. The World Yearbook of Education. London. pp. 17-31.

Kleinschmidt, H. (2000). Understanding the Middle Ages. Woodbridge: The Boydell Press.

George F. Madaus and Thomas Kellaghan (1993). Testing as a Mechanism of Public Policy: A Brief History and Description. Measurement and Evaluation in Counseling and Development, 26, April, 6-10. [I have yet to look this one up, does anybody have a digital copy?]

George F. Madaus and Laura M. O'Dwyer (1999). A Short History of Performance Assessment: Lessons Learned. Phi Delta Kappan, May 1999. http://www.questia.com/googleScholar.qst?docId=5001256493

Marguerite M. Clarke, George F. Madaus, Catherine L. Horn and Miguel A. Ramos (2000). Retrospective on educational testing and assessment in the 20th century. Journal of Curriculum Studies, 32, 159-181.

Joan L. Richards (1988). Mathematical visions. The pursuit of geometry in Victorian England. Academic Press.



Searby, Peter (1997). A history of the University of Cambridge. Volume III, 1750-1870. Cambridge: Cambridge University Press. Atheneum, 19-11-97.
Covers the period of development of the competitive tripos examinations, giving many details concerning the examinations and developments over time.



Vocht, H. de (1951-1955). History of the foundation and the rise of the Collegium Trilingue Lovaniense 1517-1550. 4 parts. Louvain: Bibliothèque de l'Université, Bureau de Recueil.
The early history of a famous Leuven college, not one of the four colleges participating in the competitive Leuven examinations for artes students. De Vocht gives many details on the ranking of many persons in the promotion to Master of Arts, also in the fifteenth century. His sources include E. H. J. Reusens (1869), Promotions de la Faculté des Arts de l'Université de Louvain, 1428-1797 (1st part, 1428-1568); the Leuven manuscript Promotiones in Facultate Artium Universitatis Lovaniensis ab anno 1500 ad annum 1659; and, for example, extracts from the Sextus Liber Actorum Facultatis Artium (itself now lost; the first (1427-1441), second (1441-1447) and fifth (1508-1511) books have (partly) survived).



Ginette Delandshere (2001). Implicit Theories, Unexamined Assumptions and the Status Quo of Educational Assessment. Assessment in Education: Principles, Policy & Practice, 8, 113-133.



Cameron Graham and Dean Neu (2004). Standardized testing and the construction of governable persons. Journal of Curriculum Studies, 36, 295-319.



W. Todd Rogers and Donald A. Klinger (NCME, 2007). Purposes of and Issues with the Provincial Testing Programs in Alberta. pdf



Johann Georg Prinz von Hohenzollern and Max Liedtke (Eds.) (1991). Schülerbeurteilungen und Schulzeugnisse. Bad Heilbrunn/Obb.: Julius Klinkhardt. ISBN 378150655X.



Laura Meilink-Hoedemaker (undated). Sollicitaties in Delft en Den Haag, 18, 19 en 20 juli 1741. doc. Brochure (ISBN 90-75806-13-2).



Janet Delve (2003). The College of Preceptors and the Educational Times: Changes for British mathematics education in the mid-nineteenth century. Historia Mathematica, 30, 140-172. pdf



Andrew Warwick (2003). Masters of Theory: Cambridge and the Rise of Mathematical Physics. University of Chicago Press.


More material on the mathematical tripos in the Wikipedia html


In the Elibron Classic series a lot of nineteenth century books on mathematics as well as on the examinations themselves have been republished recently. See here. For example:



Lynn Thorndike (1940). Elementary and Secondary Education in the Middle Ages. Speculum, 15, 400-408. pdf JStor



Christopher Stray (2001). The Shift from Oral to Written Examination: Cambridge and Oxford 1700-1900. Assessment in Education: Principles, Policy & Practice, 8, 33-50.



L. S. Shulman (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15 #2, 4-14. http://www.fisica.uniud.it/URDF/masterDidSciUD/materiali/pdf/Shulman_1986.pdf



Schoengen, M. (1898). Die Schule von Zwolle von ihren Anfängen bis zur Einführung der Reformation (1582). I. Von den Anfängen bis zu dem Auftreten des Humanismus. Freiburg (Schweiz). html



links

Pictura Paedagogica Online, the digital image archive on the history of education. "The collection currently comprises several tens of thousands of book illustrations from the Middle Ages to 1850, as well as historical postcards from the period 1870 to 1933." site


The Encyclopedia Britannica 1911 on examinations: http://encyclopedia.jrank.org/EUD_FAT/EXAMINATIONS.html


Francis Galton writes on the mathematical tripos in his Hereditary genius, see html [In fact, the whole book is made available there] Especially watch the fantastic figures in the table on page 19 here.



to be researched

In a way, the article itself is a small catalogue of themes deserving further research.


May, 2006. A crucial historical development in assessment occurs where, next to assessment as an integral part of the instructional process, a kind of functional assessment develops, in connection with the universal licence to teach everywhere. It would be good to have a thorough understanding of what is happening here, and to connect it with other developments in culture and society, especially the end of feudalism and the rise of more or less autonomous cities.
From this moment in history on, assessment is a schizophrenic thing, a beast with two souls, the two souls fighting each other in all possible ways, involving many actors on all levels of society.
Functionalism itself is not the problem; remember Charlemagne establishing schools because of the need for clerks able to read and write, to say the least. It is the functionalism of this functionalism, the diploma disease, that introduces stresses in education.


Peter K. Bol (1997). Examinations and orthodoxies 1070 and 1313 compared. In Theodore Huters, R. Bin Wong & Pauline Yu (Eds): Culture & state in Chinese history. Conventions, accommodations, and critiques. (29-57) Stanford University Press.

Pierre Bourdieu et Jean-Claude Passeron (1970). La reproduction. Éléments pour une théorie du système d'enseignement. Paris: Les Éditions de Minuit.

J. McK. Cattell (1890). Mental tests and measurement. Mind, 15, 373-381. html

Wainer, H. (1987). The first four millennia of mental testing: from ancient China to the computer age. Educational Testing Service Research Report 87-34. 6 pages. [also: The Score, 13, 4-5, 11-13, April, 1990.] [I have seen neither. ETS does not offer an online copy. Does anyone care to send me a pdf copy?]


May, 2006. The 1997 article intentionally leaves out developments in the 20th century, because they are mainly of another kind, connected with the explosion in educational participation and with processes of assessment automation.
An article on these 20th century developments will have to deal with the spectacular development of psychological testing techniques and the way American education in particular almost immediately gets infected with these techniques and their accompanying philosophies. Early in the twentieth century assessment takes a crucial and seemingly irreversible turn: assessment is regarded as a kind of measurement, and it is held to be the better served the less subjective the measurement devices are. This philosophy goes against the grain of extensive experience in education that actors in education are strongly influenced in their behavior by the kind of examinations etcetera that society (politics) confronts them with.


Ellen Condliffe Lagemann (2000). An elusive science: The troubling history of education research. University of Chicago Press. site

Agnes M. Lathe (1889). Written examinations—their abuse, and their use. Education; a monthly magazine devoted to the science, art, philosophy and literature of education, volume 9, 452-456. OCR of text Reprinted in John A. Laska and Tina Juarez (Eds) (1992). Grading and marking in American schools. Two centuries of debate. Springfield, Illinois: Thomas. contents

Daniel Starch (1916). Educational measurements. New York: Macmillan. 10Mb pdf


The point to make now is the following. Early in the 20th century the testing virus escapes from the psychological laboratories and infects the educational establishment. The philosophy and technology of testing are wholeheartedly taken in by the educational community. Large investments, many people and institutions are involved. What is happening is a lock-in on an enormous scale: educational assessment has become educational measurement using standardized achievement tests, or at least teacher-made tests resembling such standardized tests.
What is a lock-in? The QWERTY keyboard is an example of a lock-in on a particular keyboard layout, an early pragmatic layout that later on proves to be the final one because all attempts to introduce better layouts fail. The VHS video technology is another example: an inferior technology winning the market against the superior Video 2000 technology developed by Philips (Philips later developed CD laser technology, earning huge profits on the patent).

A corollary of the lock-in hypothesis concerning educational measurement is the following. In the course of the 20th century there has been a flurry of research on educational testing, resulting in more efficient testing formats, more refined standards of (achievement) testing, highly sophisticated statistical techniques used in developing and evaluating tests, online testing technology, etcetera. Progressively better techniques and tests have become available, while the founding philosophies of this educational measurement, if they ever have been there in a form more elaborate than the passages quoted above, have stayed as inappropriate to the educational process as ever. It is highly exceptional for educational measurement specialists to rise to the issue and propose new approaches recognizing the way assessments in education truly function. Names: Popham (US), Van Naerssen (Netherlands, html). Themes: backwash, feedforward. Research:



H. Becker, B. Geer and E. C. Hughes (1968). Making the grade: the academic side of college life. John Wiley. site



James S. Coleman (1990). Foundations of social theory. Harvard University Press. site



James S. Coleman (1-6-1994) (concept) What goes on in school: a student's perspective. paper 3/25/94; rev. 6/1/94. html


See numerous entries, articles etc. on this website www.benwilbrink.nl



Ansgar Allen (2012 online first). The examined life: On the formation of souls and schooling. American Educational Research Journal abstract

“The purpose of this article is largely rhetorical. It seeks to demonstrate that even a relatively quick survey of the development of modern examination must cast doubt on contemporary efforts to ameliorate its effects. Specifically, it seeks to break down the current tendency in education to adjudicate between good and bad examining practices, between those examining techniques that are seen as oppressive, impersonal, and excessively mechanistic and those that are celebrated for their flexibility and attention to the needs of the child. It is argued that both summative and formative traditions in assessment help perpetuate in their respective techniques processes of subject formation that have as their object the construction of selves amenable to government.”



Dominique Julia (1994): Le choix des professeurs en France: vocation ou concours? 1700–1850. Paedagogica Historica: International Journal of the History of Education, 30, 175-205. abstract


The ‘agrégation’, instituted at the end of the 18th century.



Dominique Julia (Ed.) (1994). Aux sources de la compétence professionnelle: critères scolaires et classements sociaux dans les carrières intellectuelles en Europe, XVIIe - XIXe siècles. Paedagogica Historica, International Journal of the History of Education, 30 #1, 9-459. contents


Deals with meritocracy and its origins, but above all (from what I have seen of it at a glance) with how deceptive the appearance of meritocratic methods proves to be in hindsight.



Rosalind Pritchard (2006). Trends in the restructuring of German universities. Comparative Education Review, 50, 90-112. abstract




George F. Madaus (1988). The influence of testing on the curriculum. In Laurel N. Tanner (Ed.) (1988). Critical issues in curriculum (83-121). NSSE. [immediately following it: Daniel Tanner (1988). The textbook controversies. pp. 122-147.] [feedforward, backwash, washback] paywalled








March 10, 2014 \ contact ben at at at benwilbrink.nl    

http://www.benwilbrink.nl/publicaties/97AssessmentStEE.htm http://goo.gl/PQd1t



Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities html