The publications database contains papers and technical reports written by XRCE scientists from 1991 up to the present. If the document you are looking for cannot be downloaded in PDF or Postscript format, please write to webmaster@xrce.xerox.com specifying the title, author(s) and reference. Any document downloaded from the publications database may be used for non-commercial purposes only.

  • Year 2006

    Acronym-meaning extraction from corpora using multitape weighted finite-state machines
    Andre Kempe
    Abstract | report.pdf (66.7KB)
    Research Report
    Ref. : 2006/019



    Automatic evaluation of machine translation quality
    Cyril Goutte
    Abstract | MTeval.pdf (76.38KB)
    Presentation at the European Community
    Ref. : 2006/002



    Categorisation in multiple category systems
    Gabriela Csurka, Eric Gaussier, Cyril Goutte, Francois Pacull, Jean-Michel Renders
    Abstract | 2006-012.pdf (144.42KB)
    International Conference on Machine Learning (ICML), Pittsburgh, Pennsylvania, June 25-29, 2006.
    Ref. : 2006/012



    Lexical entailment for information retrieval
    Stéphane Clinchant, Cyril Goutte, Eric Gaussier
    Abstract | xrce_entailment.pdf (118.33KB)
    European Conference on Information Retrieval, London, UK, 10-12 April 2006.
    Ref. : 2005/064



    Multitape automata with symbol classes
    Florent Nicart, Jean-Marc Champarnaud, Tibor Csaki, Tamas Gaal, Andre Kempe
    Abstract
    11th International Conference on Implementation and Application of Automata, Taipei, Taiwan, 21-23 August, 2006.
    Ref. : 2006/011



    Overview of Generative and Discriminative hybrid models
    Guillaume Bouchard
    Abstract | result.pdf (425.18KB)
    IDIAP 15th Anniversary Workshop, Martigny, Switzerland, September 12-13, 2006.
    Ref. : 2006/027



    SMART: Research Directions
    Nicola Cancedda
    Abstract | iiia06.short.pdf (21.72KB)
    IIIA - International workshop on Intelligent Information Access, Helsinki, Finland, July 6-8, 2006.
    Ref. : 2006/016
  • Year 2005

    A class of rational n-WFSM Auto-Intersections
    Andre Kempe, Jean-Marc Champarnaud, Jason Eisner, Franck Guingne, Florent Nicart
    Abstract | ciaa.pdf (183.11KB)
    CIAA 2005, Sophia-Antipolis, France, June 27-29, 2005.
    Ref. : 2005/021



    A probabilistic interpretation of precision, recall and F-score, with implication for evaluation
    Cyril Goutte, Eric Gaussier
    Abstract | xrce_eval.pdf (168.76KB)
    ECIR 27th European Conference on Information Retrieval, Santiago de Compostela, Spain, 21-23 March, 2005.
    Ref. : 2004/058



    A voice enabled procedure browser for the international space station
    Manny Rayner, Beth Ann Hockey, Nikos Chatzichrisafis, Kim Farrell, Jean-Michel Renders
    Abstract | RaynerEAClarissaACL05.pdf (457.08KB)
    ACL Conference, University of Michigan, 20-25 June, 2005.
    Ref. : 2005/018



    Co-occurrence models in music genre classification
    Peter Ahrendt, Cyril Goutte, Jan Larsen
    Abstract | co_occurrence_article.pdf (95.08KB)
    IEEE International workshop on Machine Learning for Signal Processing, Mystic, Connecticut, USA, September 28-30, 2005.
    Ref. : 2005/031



    Contributions à l'accès à l'information documentaire
    Eric Gaussier
    Abstract | publi.pdf (455KB)
    HDR Report defended at the University Joseph Fourier, Grenoble, France, Dec. 12, 2005
    Ref. : 2005/067



    German compound analysis with wfsc
    Anne Schiller
    Abstract | FSMNLP2005_schiller.pdf (103.15KB)
    Finite State Methods and Natural Language Processing 2005, Helsinki, 1-2 September 2005.
    Ref. : 2005/032



    Hierarchical Part-Based Visual Object Categorization
    Guillaume Bouchard, Bill Triggs
    Abstract
    Appeared in Computer Vision and Pattern Recognition, Volume 1, pp. 710-715.
    Ref. : 2005/071



    Learning from partially labelled data -- with confidence
    Eric Gaussier, Cyril Goutte
    Abstract | xrce_confidence.pdf (912.81KB)
    Proceedings of Learning with Partially Classified Training Data - ICML 2005 workshop, Bonn, Germany, 7 August, 2005.
    Ref. : 2005/030



    Literality based sample sorting for syntax projection
    Bruno Cavestro, Nicola Cancedda
    Abstract | Wclki1005.pdf (114.51KB)
    Cross-Language knowledge induction workshop, "Babes-Bolyai" University, Cluj-Napoca, Romania, 25 July - 6 August, 2005.
    Ref. : 2005/039



    Relation between PLSA and NMF and implications
    Eric Gaussier, Cyril Goutte
    Abstract
    The 28th annual International ACM SIGIR, Conference on Research and Development in information retrieval, Salvador, Brazil, August 15-19, 2005.
    Ref. : 2005/029



    Traduction automatique statistique avec des segments discontinus
    Michel Simard, Nicola Cancedda, Bruno Cavestro, Marc Dymetman, Eric Gaussier, Cyril Goutte, Kenji Yamada, Arne Mauser
    Abstract | final.pdf (143.74KB)
    Traitement Automatique des Langues Naturelles (TALN 2005), Dourdan, France, 6-10 June 2005.
    Ref. : 2005/001



    Translating with non-contiguous phrases
    Michel Simard, Nicola Cancedda, Bruno Cavestro, Marc Dymetman, Eric Gaussier, Cyril Goutte, Philippe Langlais, Kenji Yamada, Arne Mauser
    Abstract | simard05translating.pdf (132.41KB)
    HLT/EMNLP: Human Language Technology Conference/Conference on Empirical methods in natural language processing, Vancouver, Canada, October 6-8, 2005.
    Ref. : 2005/042



    WFSM Auto-Intersection and join algorithms
    Andre Kempe, Jean-Marc Champarnaud, Franck Guingne, Florent Nicart
    Abstract
    FSMNLP 2005 (5th Int. Workshop on Finite State Methods in Natural Language Processing), Helsinki, Finland, September 1-2, 2005.
    Ref. : 2005/028
  • Year 2004

    A Geometric view on bilingual lexicon extraction from comparable corpora
    Eric Gaussier, Jean-Michel Renders, Irina Matveeva, Cyril Goutte, Hervé Dejean
    Abstract | 2004_013.pdf (94.16KB)
    42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, July 25-26, 2004.
    Ref. : 2004/013



    A note on join and auto-intersection of n-ary rational relations
    Andre Kempe, Jean-Marc Champarnaud, Jason Eisner
    Abstract | 2004_045.pdf (305.85KB)
    Eindhoven FASTAR Days, 2004.
    Ref. : 2004/045



    Algorithms for Weighted multi-Tape Automata
    Andre Kempe, Franck Guingne, Florent Nicart
    Abstract | 2004_031.pdf (233.82KB)
    XRCE Technical Research Report 2004/031.
    Ref. : 2004/031



    Aligning words using matrix factorisation
    Cyril Goutte, Kenji Yamada, Eric Gaussier
    Abstract | 2004_015.pdf (88.09KB)
    42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, July 25-26, 2004.
    Ref. : 2004/015



    Assisting medical annotation in Swiss-Prot using statistical classifiers
    Pavel Dobrokhotov, Cyril Goutte, Anne-Lise Veuthey, Eric Gaussier
    Abstract | 2004_021.pdf (246.29KB)
    International Journal of Medical Informatics 74(2-4):317-324.
    Ref. : 2004/021



    Chart-parsing techniques and the prediction of valid editing moves in structured document authoring
    Marc Dymetman
    Abstract | 2004_029.pdf (207.22KB)
    DocEng, ACM Symposium on Document Engineering, Milwaukee, Wisconsin, October 28-30, 2004.
    Ref. : 2004/029



    Computing the follow automaton of an expression
    Jean-Marc Champarnaud, Florent Nicart, Djelloul Ziadi
    Abstract
    Ninth International Conference on Implementation and Application of Automata, Kingston, Canada, July 22-24, 2004.
    Ref. : 2004/036



    Corpus-Based vs. Model-Based Selection of Relevant Features
    Cyril Goutte, Pavel Dobrokhotov, Eric Gaussier, Anne-Lise Veuthey
    Abstract | 2004_011.zip (1.9MB)
    Proceedings of CORIA04, Toulouse, France, March 10-12, 2004, pp. 75-88.
    Ref. : 2004/011



    Generative vs Discriminative approaches to entity Recognition from label deficient data
    Cyril Goutte, Eric Gaussier, Nicola Cancedda, Hervé Dejean
    Abstract | 2003_079.pdf (126.99KB)
    JADT 2004, 7èmes journées internationales analyse statistique des données textuelles, Louvain-la-Neuve, Belgium, 10-12 March 2004.
    Ref. : 2003/079



    Morphological analysis and generation: a first step in natural language processing
    Ken Beesley
    Abstract
    Fourth international conference on Language Resources and Evaluation, LREC 2004, Lisbon, Portugal, May 26-28
    Ref. : 2004/026



    NLP Applications based on weighted multi tape automata
    Andre Kempe
    Abstract
    TALN, Fes, Morocco, April 19-22, 2004
    Ref. : 2004/002



    Reducing Cover Subsequential Transducers
    Jean-Marc Champarnaud, Franck Guingne, Georges Hansel
    Abstract
    Descriptional Complexity of Formal Systems. 6th workshop, London, Ontario, Canada, July 26-28, 2004.
    Ref. : 2004/037



    Similarity relations and cover automata
    Jean-Marc Champarnaud, Franck Guingne, Georges Hansel
    Abstract
    To appear in RAIRO-ITA
    Ref. : 2004/024



    Three new algorithms for WMTAs
    Andre Kempe, Florent Nicart, Franck Guingne
    Abstract
    WATA 2004 Weighted Automata: Theory and Applications, Dresden, June 1-5, 2004.
    Ref. : 2004/017



    Typesetting Deseret Alphabet with LaTeX and METAFONT
    Ken Beesley
    Abstract
    The 25th Annual Meeting and Conference of TeX Users Group, Xanthi, Greece, Aug 30-Sept 3 2004.
    Ref. : 2004/007
  • Year 2003

    A Probabilistic information retrieval approach to medical annotation in SWISS-PROT
    Pavel Dobrokhotov, Cyril Goutte, Anne-Lise Veuthey, Eric Gaussier
    Abstract | dobrokhotov03probabilistic.pdf (82.95KB)
    Proceedings of Medical informatics Europe (MIE2003), Saint Malo, France, May 4-7, 2003.
    Ref. : 2003/008



    Acyclic Networks maximizing the Printing complexity
    Franck Guingne, Andre Kempe, Florent Nicart
    Abstract | Acyclic-Networks.pdf (295.81KB)
    To appear in the journal TCS (Theoretical Computer Science)
    Ref. : 2003/071



    Assessing automatically extracted bilingual lexicons for CLIR in vertical Domains
    Jean-Michel Renders, Hervé Dejean, Eric Gaussier
    Abstract
    To appear in "Lecture Notes in Computer Science"
    Ref. : 2003/058



    Automatic processing of multilingual medical terminology: applications to thesaurus enrichment and cross-language information retrieval
    Hervé Dejean, Eric Gaussier, Jean-Michel Renders, Fatia Sadat
    Abstract | 2003_004.pdf (134.68KB)
    Artif Intell Med. 2005 Feb;33(2):111-24. PMID: 15811780 [PubMed - indexed for MEDLINE]
    Ref. : 2003/004



    Categorisation de documents PubMed pour l'annotation médicale dans SWISS-PROT
    Cyril Goutte, Pavel Dobrokhotov, Eric Gaussier, Anne-Lise Veuthey
    Abstract | goutte_dobrokhotov03.pdf (179.38KB)
    EGC Conférence, Atelier "Fouille de données et recherche d'informations dans des bases de données multimédia semi-structurées", Lyon, France, January 22, 2003.
    Ref. : 2003/014



    Combining NLP and probabilistic categorisation for document and term selection for SWISS-PROT Medical annotation
    Pavel Dobrokhotov, Cyril Goutte, Anne-Lise Veuthey, Eric Gaussier
    Abstract | dobrokhotov03combining.pdf (383.27KB)
    Proceedings of the 11th International Conference on Intelligent Systems for Molecular Biology (ISMB 2003)
    Ref. : 2003/009



    Creating a web based demo of a finite state morphological analyzer
    Ken Beesley
    Abstract
    Available on http://www.fsmbook.com
    Ref. : 2003/078



    Editing and Authoring: A structural adviser for the XML document authoring
    Boris Chidlovskii
    Abstract | p346-chidlovskii.pdf (131.39KB)
    Pages 203-211 in Proceedings of the 2003 ACM Symposium on Document Engineering, Grenoble, France, November 20-22, 2003.
    Ref. : 2003/070



    Entre syntaxe et sémantique : normalisation de la sortie de l'analyse syntaxique en vue de l'amélioration de l'extraction d'information à partir de textes.
    Caroline Hagege, Claude Roux
    Abstract
    TALN, Batz-sur-Mer, France, 11-14 June 2003.
    Ref. : 2003/025



    MDA-XML : Une expérience de rédaction contrôlée multilingue basée sur XML
    Guy Lapalme, Caroline Brun, Marc Dymetman
    Abstract
    TALN, Batz-sur-mer, France, 11-14 June 2003.
    Ref. : 2003/024



    Multi-language machine translation through interactive document normalization
    Aurelien Max
    Abstract | EACL2003-EAMT-Max.pdf (1008.29KB)
    EACL Workshop, Budapest, Hungary, 12-17 April 2003.
    Ref. : 2003/017



    Normalization and paraphrasing using symbolic methods
    Caroline Brun, Caroline Hagege
    Abstract
    ACL: Second International workshop on Paraphrasing, Paraphrase Acquisition and Applications, Sapporo, Japan, July 7-12, 2003.
    Ref. : 2003/044



    Problèmes d'intersubjectivité dans l'évaluation des analyseurs syntaxiques
    Salah Ait-Mokhtar, Caroline Hagege, Agnes Sandor
    Abstract
    TALN Conference, Batz-sur-mer, France, June 11-14, 2003.
    Ref. : 2003/039



    Reducing parameter space for word alignment
    Hervé Dejean, Eric Gaussier, Cyril Goutte, Kenji Yamada
    Abstract | Dejean.pdf (41.78KB)
    NAACL/HLT Workshop Building and Using Parallel Texts: Data Driven Machine Translation and Beyond (http://www.cs.unt.edu/~rada/wpt/), Edmonton, Canada, May 31, 2003.
    Ref. : 2003/051



    Report on CLEF-2003 experiments: Two Ways of extracting multilingual resources from corpora
    Hervé Dejean, Eric Gaussier, Jean-Michel Renders, Alexei Vinokourov
    Abstract | CLEF-2003.pdf (76.36KB)
    CLEF 2003, Norway, Trondheim, August 21-22, 2003.
    Ref. : 2003/061



    Reversing controlled document authoring to normalize documents
    Aurelien Max
    Abstract | EACL2003-SRW-Max.pdf (551.51KB)
    EACL 11th Conference of the European Chapter of the Association for Computational Linguistics, Budapest, Hungary, April 12-17, 2003.
    Ref. : 2002/055



    Running time complexity of traversing and printing an acyclic automaton
    Franck Guingne, Andre Kempe, Florent Nicart
    Abstract
    CIAA, Santa Barbara, CA, USA, July 16-18, 2003. Volume 2759 of Lecture Notes in Computer Science, Springer Verlag, pages 131-140.
    Ref. : 2003/038



    Text Chat in action
    Jacki O'Neill, David Martin
    Abstract | Group03_textchatFD2.doc (148KB)
    Group 2003 Conference, Sanibel Island, Florida, USA, November 9-12, 2003.
    Ref. : 2003/054



    The adequate design of ethnographic outputs for practice: some explorations of the characteristics of design resources
    Tim Diggins, Peter Tolmie
    Abstract | Design_Resources_withimages.pdf (417.57KB)
    1AD: First International Conference on Appliance Design, Bristol, UK, May 6-8, 2003.
    Ref. : 2003/029



    Towards interactive text understanding
    Marc Dymetman, Aurelien Max, Kenji Yamada
    Abstract
    ACL 2003, 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan, July 7-12, 2003.
    Ref. : 2003/048



    Traitement automatique des langues et recherche d'information
    Eric Gaussier, Christian Jacquemin, Pierre Zweigenbaum
    Abstract
    To appear in "Traité Sciences de l'Information - Assistance Intelligente à la recherche d'information" Eds: E.Gaussier et M-H Stefanini
    Ref. : 2003/042



    Virtual operations on virtual networks: the priority union
    Franck Guingne, Florent Nicart, Jean-Marc Champarnaud, Lauri Karttunen, Tamas Gaal, Andre Kempe
    Abstract
    To appear in International Journal of Foundations of computer science
    Ref. : 2003/010



    WFSC - A new weighted finite state compiler
    Andre Kempe, Christof Baeijs, Tamas Gaal, Franck Guingne, Florent Nicart
    Abstract
    8th Int. Conf. on Implementation and Application of Automata (CIAA 03), Santa Barbara, CA, USA, July 16-18, 2003. Volume 2759 of Lecture Notes in Computer Science, Springer Verlag, pages 108-119.
    Ref. : 2003/037
  • Year 2002

    A hierarchical model for clustering and categorising documents
    Eric Gaussier, Cyril Goutte, Kris Popat, Francine Chen
    Abstract | gaussier02hierarchical.ps.gz (98.94KB)
    Advances in Information Retrieval -- Proceedings of the 24th BCS-IRSG European Colloquium on IR Research (ECIR-02), Glasgow, March 25-27, 2002. Lecture Notes in Computer Science 2291, pp. 229-247, Springer.
    Ref. : 2002/004



    A rule based Pronoun Resolution System for French
    François Trouilleux
    Abstract
    Proc. of Discourse Anaphora and Anaphor Resolution Colloquium (DAARC 2002), Lisbon, Portugal, Sept. 18-20, 2002.
    Ref. : 2002/035



    Bilingual lexicon extraction: using and enriching multilingual thesauri
    Hervé Dejean, Eric Gaussier, Fatia Sadat
    Abstract | HDejean.pdf (101.55KB)
    Proc. of Terminology Knowledge Extraction, Nancy, France, August 25-30, 2002.
    Ref. : 2002/029



    Bilingual terminology extraction: an approach based on a multilingual thesaurus applicable to comparable corpora
    Hervé Dejean, Eric Gaussier, Fatia Sadat
    Abstract | dejean.pdf (77.43KB)
    Proc. of COLING, Taipei, Taiwan, 24-30 August, 2002.
    Ref. : 2002/025



    Combining labelled and unlabelled data : a case study on Fisher kernels and transductive inference for biological entity recognition
    Cyril Goutte, Hervé Dejean, Eric Gaussier, Jean-Michel Renders, Nicola Cancedda
    Abstract | goutte02combining.ps.gz (41.82KB)
    Proc. of Sixth Conference on Natural Language Learning (CoNLL-2002), Taipei, Taiwan, 24-25 August, 2002.
    Ref. : 2002/024



    Enriching a text by semantic disambiguation for information extraction
    Bernard Jacquemin, Caroline Brun, Claude Roux
    Abstract | LREC2002Bernard.pdf (48.89KB)
    Conference Proceedings LREC, Las Palmas, Spain, June 2, 2002.
    Ref. : 2002/012



    Extraction and recoding of input-Epsilon Cycles in finite state transducers
    Andre Kempe
    Abstract
    volume 313/1 of Theoretical Computer Science, Elsevier Science, pages 145-158.
    Ref. : 2002/045



    Finite State Lazy Operations in NLP
    Franck Guingne, Florent Nicart
    Abstract
    Proc. of CIAA, Tours, France, July 3-5, 2002.
    Ref. : 2002/040



    Insertions et interprétation des expressions pronominales
    François Trouilleux
    Abstract | crra-taln02_ftrouilleux.pdf (93.92KB)
    Proc. of TALN 2002, Nancy, France, 24-27 June 2002.
    Ref. : 2002/013



    Les outils de TAL au service de la e-formation en langues
    Caroline Brun, Thibault Parmentier, Agnes Sandor, Frederique Segond
    Abstract
    In "Multilinguisme et Traitement de l'Information", edited by Frédérique Segond, Hermes, 2002.
    Ref. : 2002/038



    Linguistic Processing of Biomedical Texts
    Caroline Hagege, Agnes Sandor, Anne Schiller
    Abstract
    Proceedings of PorTAL 2002, Portugal for Natural Language Processing, Faro, Portugal, June 23-26, 2002.
    Ref. : 2002/014



    Normalisation de Documents par Analyse du Contenu à l'aide d'un modèle Sémantique et d'un Générateur
    Aurelien Max
    Abstract
    Proc. TALN-Recital 2002, Nancy, France, 24-27 June 2002.
    Ref. : 2002/023



    Probabilistic Models for Hierarchical Clustering and Categorisation : Applications in the information Society
    Eric Gaussier, Cyril Goutte
    Abstract | gaussier02probabilistic.ps.gz (50.64KB)
    Proceedings of the Intl. Conf. on Advances in Infrastructure for Electronic Business, Education, Science and Medicine on the Internet, L'Aquila, Italy, January 21-27, 2002.
    Ref. : 2002/005



    Rédaction multilingue assistée dans le modèle MDA
    Caroline Brun, Marc Dymetman
    Abstract
    To appear in a Book "Multilinguisme et traitement d'information", end of year 2002.
    Ref. : 2002/052



    Word-Sequence Kernels
    Nicola Cancedda, Eric Gaussier, Cyril Goutte, Jean-Michel Renders
    Abstract | cancedda03a.pdf (235.79KB)
    The Journal of Machine Learning Research
    Ref. : 2002/010
  • Year 2001

    Language Technologies and patent search and classification
    David Hull, Salah Ait-Mokhtar, Mathieu Chuat, Andreas Eisele, Eric Gaussier, Greg Grefenstette, Pierre Isabelle, Christer Samuelsson, Frederique Segond
    Abstract
    World Patent Information 23 (2001) 265-268
    Ref. : 2004/044



    Towards a Consistent Logical Framework for Ontological Analysis
    Aaron Kaplan
    Abstract
    Proceedings of the International Conference on Formal Ontology in Information Systems, Ogunquit, Maine, October, 2001.
    Ref. : 2001/040