EACL Newsletter

Issue 13

December 2010

Table of Contents
  1. Editorial
  2. Report on CLIN 20
  3. Report on KONVENS 2010
  4. Report on PROPOR 2010
  5. Report on SEPLN 2010
  6. Report on TALN 2010


Welcome to the second issue of the EACL newsletter for year 2010. With this issue, EACL is launching a new initiative: a special yearly issue of the newsletter entirely devoted to the conferences and meetings of the regional academic associations in Europe dealing with Computational Linguistics. These issues report on regional events that will certainly be of interest to the European computational linguistics community.

In this issue we deal with major regional events in Computational Linguistics held in 2010 and involving European associations. Gertjan van Noord reports on the 20th meeting of Computational Linguistics in the Netherlands (CLIN 20). Sabine Schulte im Walde reports on the 10th Conference on Natural Language Processing (KONVENS 2010). Alfonso Ureña reports on the 26th Annual Conference of the Spanish Society for Natural Language Processing (SEPLN 2010). António Branco reports on the 9th International Conference on the Computational Processing of Portuguese (PROPOR 2010). Finally, Anne Vilnat reports on the 17th Conference on Natural Language Processing (TALN 2010).

Toni Martí and Giorgio Satta, editors

Report on CLIN 20

20th meeting of Computational Linguistics in the Netherlands
February 5th, 2010
Utrecht, The Netherlands

The anniversary 20th meeting of Computational Linguistics in the Netherlands (CLIN for short) took place on February 5th, 2010, in Utrecht. The first CLIN meeting was also organised in Utrecht, at the end of 1990, and since then the CLIN meeting has been organised each year. From the beginning, the annual one-day meetings have maintained a low threshold, to ensure wide participation. A short abstract suffices, and reports on ongoing research are explicitly allowed. As a consequence, the number of registered participants has been over 100 in recent years, and the number of submitted abstracts over 50. In more recent meetings, a number of sessions have been organized in parallel, and a subset of the submissions has been accepted as poster presentations. In order to make participation more attractive, a well-known researcher from abroad is usually invited for a keynote lecture. Thanks to this structure, there is one day a year on which 'all' computational linguists of the Netherlands and Flanders meet.

For the CLIN 20 meeting in Utrecht, 55 submissions were accepted for a short presentation, and an additional 20 submissions were accepted as poster presentations. The scientific program was completed by a keynote presentation by Bernardo Huberman (HP Labs) on 'Social Attention in the Age of the Web'. This presentation turned out to be a very interesting lecture with lots of amusing examples (the slides are available from the conference website), although for some, the link with ongoing research in computational linguistics may have been a bit remote.

As the number of papers and presentations suggests, it is hard to give an objective overview of the topics or the quality of all these contributions. Objectivity is further complicated by the fact that I was involved in a number of presentations.

Some of the presentations reported ongoing work in the creation of annotated corpus data for Dutch. Orphee de Clercq and Martin Reynaert reported on their work in the construction of the SONAR reference corpus of written Dutch. In a related presentation, the recently completed Dutch Parallel Corpus was presented, which will be equally important. Researchers from Ghent reported on the construction of a balanced named entity corpus for Dutch. Finally, Erik Tjong Kim Sang presented some new techniques to browse the LASSY syntactically annotated corpus of Dutch. These presentations illustrate the investments that have been made during the last few years for Dutch language and speech technology through the STEVIN programme, funded by both the Flemish and Dutch governments. The availability of these various new corpora may turn out to be very important for the next decades of (Dutch) computational linguistics.

Among the presentations that I found interesting was one by Morante Vellejo and Daelemans, which focused on the task of finding the scope of negation and hedge cues in biomedical texts. In at least four other presentations, further progress was reported on the use of distributional similarity techniques for lexical semantics. Related to this was an interesting presentation by Erwin Marsi and Emiel Krahmer in which they attempt to find semantic similarity at the level of phrases and sentences.

A new initiative at CLIN is the STIL thesis prize, which is awarded during CLIN in a short ceremony. The STIL (Stichting Toepassing Inductieve Leertechnieken) Prize is a prize for the best MA thesis describing high quality research in computational linguistics and its applications. The prize was awarded to Maarten van Gompel for his thesis 'Phrase-based memory-based machine translation'. Fittingly, van Gompel was also the first author of one of the regular CLIN presentations.

The program as well as the abstracts of the presentations is available on-line at http://www.clin.nl/20.

Gertjan van Noord,
University of Groningen, The Netherlands
CLIN working group

Report on KONVENS 2010

10th Conference on Natural Language Processing
September 6-8, 2010
Saarbrücken, Germany

The biennial Conference on Natural Language Processing KONVENS ('Konferenz zur Verarbeitung natürlicher Sprache') is organised in turn by the scientific societies DGfS-CL (German Linguistic Society, Special Interest Group on Computational Linguistics), the GSCL (German Society for Computational Linguistics and Language Technology), and the ÖGAI (Austrian Society for Artificial Intelligence). The conference originally had a focus on research in the German-speaking countries, but by now the vast majority of contributions are in English, and the participants are international.

The 10th KONVENS was held at Saarland University in Saarbrücken, Germany, September 6-8, 2010. The local organisers were Manfred Pinkal, Ines Rehbein, and Magdalena Wolska from the Department of Computational Linguistics and Phonetics. Due to the central theme of this year's KONVENS, 'Semantic Approaches in Natural Language Processing', a main focus of the contributions was on linguistic aspects of meaning, covering both deep and shallow approaches to semantic processing, and foundational aspects as well as applications. The proceedings of the KONVENS 2010 are available electronically from Saarland University Press ('universaar') at http://www.uni-saarland.de/de/campus/service-und-kultur/universaar/monographien.html.

With 52 submissions from 11 countries and approx. 80 participants, the KONVENS 2010 was one of the most successful ever. Furthermore, the atmosphere during the conference and the feedback of the participants were both excellent. The invited talks (given by Anette Frank from the University of Heidelberg, Germany, on 'Large-Scale Fine-Grained Named Entity Classification: Can we characterise the meanings of Named Entities and their classes distributionally?'; Ed Hovy from the ISI, University of Southern California, US, on 'Toward a Theory of Statistical Semantics'; and Gerhard Weikum from the Max-Planck-Institut für Informatik, Saarbrücken, Germany, on 'From Information to Knowledge: Harvesting Entities and Relationships from Web Sources') were all highly interesting and complemented each other very well. The remaining program included 10 oral presentations and 14 poster presentations, whose overall quality was appreciated. The poster presentations were preceded by short 5-7 minute talks, in order to provide an overview of the topics covered and to help interested participants find their way around the posters. Last but not least, the KONVENS successfully included contributions by young researchers without compromising on quality, and the medium size of the conference seemed to encourage intensive discussions among the participants.

As well as the conference itself, a demo tour, the social events, and the satellite workshop on 'Language technology and text-technological methods for E-Learning' (organised by Maja Bärenfänger and Maik Stührenberg, and hosting the two invited speakers Iryna Gurevych from the Technical University Darmstadt, Germany, on 'Wikulu: Information Management in Wikis Enhanced by Language Technologies' and Erik Duval from the Katholieke Universiteit Leuven, Belgium, on 'The Snowflake Effect in learning and research') were also well-attended and highly appreciated.

Sabine Schulte im Walde,
University of Stuttgart, Germany
Speaker of the Special Interest Group on Computational Linguistics (SIG-CL)
of the German Linguistic Society (DGfS)

Report on PROPOR 2010

9th International Conference on the Computational Processing of Portuguese
April 27-30, 2010
Porto Alegre, Brazil

PROPOR is the Portuguese verb meaning 'to propose'. It is also the acronym of the international conference on the computational processing of the Portuguese language, the language with the sixth-largest number of native speakers in the world, spread over the American, African, Asian and European continents.

In terms of mobilizing a research community, the PROPOR series of conferences plays the role that a formal association plays in other communities. It is the focal point of the computational linguistics research community, from both sides of the Atlantic and from all over the world, whose work has a focus on Portuguese.

The PROPOR conferences have been held every other year, alternating between Portugal and Brazil. They feature the full range of initiatives of scientific events of their kind, with oral and poster presentations, demo sessions, plenary keynote presentations, tutorials, best dissertation awards, etc. Selected full papers are published in Springer's Lecture Notes in Artificial Intelligence, with a very competitive acceptance rate of 27% in the last edition (LNAI 6001, http://www.springerlink.com/content/978-3-642-12319-1/). The conference is also an occasion where old acquaintances meet again and new friends are made.

As its acronym suggests in the Portuguese reading, the PROPOR conference is the place to propose new ideas and lines of action. That was what happened once more in the last edition, which took place on April 27-30, 2010, in Porto Alegre, Brazil (http://www.inf.pucrs.br/~propor2010), and which was organized by Vera Strube de Lima (General chair), António Branco (Language chair), Aldebaro Klautau (Speech chair), Thiago Pardo (Editorial chair), António Teixeira (Demos), David de Matos (Best dissertation contest), Maria das Graças Volpe Nunes (Tutorials) and Renata Vieira (Local organization).

This edition gathered over one hundred participants. The main program was preceded by two tutorials, 'Automatic Speech Recognition: from the beginning to Portuguese Language', by André Adami, and 'Fundamental and New Approaches to Statistical Machine Translation', by Lucia Specia; and complemented with a demo session with 22 demos on display. The main program itself featured 36 presentations, of which around 3/4 were from the language track and 1/4 from the speech track. The keynote plenary speech was delivered by Robert Dale, with the title 'Referring Expression Generation: Is it Getting Easier?'

The next PROPOR will be held in 2012, in Portugal, in the historic and charming town of Coimbra, and is being prepared by Fernando Perdigão (General chair).

António Branco,
University of Lisbon, Portugal
PROPOR 2010 Programme Chair

Report on SEPLN 2010

26th Annual Conference of the Spanish Society for Natural Language Processing
September 7-10, 2010
Valencia, Spain

The Spanish Society for Natural Language Processing (SEPLN: Sociedad Española para el Procesamiento del Lenguaje Natural, http://www.sepln.org) organizes a three-day conference every year, consisting of paper presentations, posters, ongoing research projects, and demos.

The aim of the conference is to provide a forum for discussion of the latest research work and developments in the field of NLP among the scientific and business communities. The conference also aims to show new possibilities for real applications and R&D projects in this field. Moreover, it seeks to identify future lines of research according to the needs of the market.

The 26th edition of the Annual Conference of the SEPLN took place at the Polytechnic University of Valencia (Spain), September 7-10, 2010. This year, the conference was held within the framework of the Spanish Computer Science Conference (CEDI: Congreso Español de Informática, http://cedi2005.ugr.es/2010/). The Conference Chair was Lidia Moreno (Polytechnic University of Valencia) and the chairs of the Program Committee were Ferran Pla and Antonio Molina (Polytechnic University of Valencia). In this edition, 55 papers were submitted. The program committee, consisting of 33 reviewers from 7 countries, selected 40 papers to be presented. This gives an acceptance rate of about 73%.

The main topics of the conference were: Corpus linguistics; Morphological, syntactic, semantic and pragmatic analysis; Development of linguistic resources and tools; Linguistic, mathematical and psycholinguistic models of language; NLP evaluation systems; Computational Lexicography and Terminology; Word Sense Disambiguation; Monolingual and multilingual text generation; Speech synthesis and recognition; Machine translation; Monolingual and multilingual information extraction and retrieval; Question answering systems; Text summarization; Machine Learning in NLP; Semantics, pragmatics and discourse; Sentiment analysis; Opinion Mining.

Three satellite workshops were organized on September 6 and 7.

Alfonso Ureña,
Universidad de Jaén, Spain
President of the Spanish Society for Natural Language Processing

Report on TALN 2010

17th Conference on Natural Language Processing
July 19-23, 2010
Montréal, Canada

The 17th Conférence sur le Traitement Automatique des Langues Naturelles (Conference on Natural Language Processing) took place in Montréal from July 19th to 23rd, 2010. It is established practice that every four years the annual conference takes place outside France (in a French-speaking country), but this was the first time the TALN conference had been organized in North America, and we can now say that it was a great success.

TALN'10 was organized by the Université de Montréal and by the École Polytechnique de Montréal, under the auspices of ATALA (Association pour le Traitement Automatique des LAngues). It was held jointly with the conference for junior researchers RECITAL’10 and followed by the DEFT workshop (DÉfi Fouille de Texte, i.e. Text Mining Challenge) and by the TALS (Traitement Automatique de la Langue des Signes, i.e. Sign Language Processing) workshop. We thank all of the organizers for the high scientific quality of the conference as well as for the cordial welcome. To begin with the organization, we can say that all the participants appreciated the coffee and lunch breaks: the smiles and the quality of the food offered by 'Fourchettes et Cie' (a community cafeteria and food bank) contributed to the atmosphere of the conference!

Concerning the scientific program, the Conference consisted of oral presentations of research work (long papers in the proceedings, about 40% of the 90 submissions), position papers, posters (short papers in the proceedings concerning ongoing research, about 50% of the 68 submissions), and invited talks and demos selected by a special committee. As for the student session, 17 papers were submitted, of which 5 (29%) were accepted for an oral presentation and 6 (35%) for a poster presentation.

The official language of the Conference is French. However, papers in English were accepted and presented by participants who were not native speakers of French. For TALN'10, the international program committee (19 people from Europe and 12 from Canada) was assisted by a scientific committee (59 from Europe, 18 from North America, 2 from Australia and 1 from Japan). TALN'10 co-presidents were Michel Gagnon and Philippe Langlais: many thanks for their work!

This year the committee decided to proceed with two separate calls for long papers and short papers, so as to allow people to directly submit short papers, or resubmit rejected long papers as short papers.

The independent RECITAL program committee consisted of 27 members (13 from Europe and 14 from Canada). RECITAL co-presidents were Alexandre Patry and Aurélien Max.

The program was organized with an invited speaker at the start of each day (Igor Mel'cuk, Pierre Isabelle and Gerald Penn), followed by a plenary session and parallel sessions. Poster sessions were preceded by 'booster sessions' to allow each author to introduce her/his poster. I don't know whether this explains the overall success of the poster sessions, but they were really a place for exchanging technical information with the authors.

The student presentations were placed at the end of the oral sessions, allowing them to benefit from the audience of these sessions.

Concerning conference attendance, about 170 people registered: 90 from France, 50 from Canada, and the others from different countries located in Europe, North America and North Africa. This is a great success for a conference organized in North America, and thus much more expensive for European participants. 15 students benefited from a financial grant from ATALA to help with their travel and accommodation.

The social events gave many people the opportunity for discussions, and ... discoveries! We have to thank CRIM (Centre de Recherche Informatique de Montréal, founded by a group of companies and universities) for the welcome cocktail they offered at the beginning of the Conference. Taking a yellow school bus to go to the dinner at Chambly was an experience for non-American people! And tasting Canadian beer was another experience ...

The demo session closed the main conference, and it was a real success.

On the last day, the workshops took place: DEFT and TALS. DEFT was organized by Dominic Forest (EBSI, U. Montréal) and Cyril Grouin (LIMSI-CNRS, Orsay). The workshop presented the results of the evaluation campaign on diachronic and diatopic variations. TALS was organized by UQAM and dedicated to Sign Language, with notable attendance by deaf participants.

The entire organizing committee deserves congratulations (all the 'red' and 'blue' T-shirts), with special thanks to Lynn da Sylva.

Anne Vilnat,
LIMSI-CNRS, University Paris-Sud XI, France
Vice-president of ATALA in charge of TALN Conferences