Extension Manager - Indexer

Last modified by Ecaterina Moraru (Valica) on 2017/10/05 12:59

 Requirements
 Completed

Description

We need to put all extensions descriptors from all repository in some index to be able to search among them quickly.

Implementation

How to implement that ?

Maven Indexer (former Nexus Indexer)

http://maven.apache.org/maven-indexer/

That's what is used by M2Eclipse to index maven project for pretty much the same need.

Pros:

  • designed for that
  • we need to use that anyway to download indexes from maven repositories and parse them
  • built in remote maven repositories index incremental fetcher

Cons:

  • very maven oriented and it will miss some informations we want (contains really minimal information)
  • stored in a file somewhere
  • does not store dependencies
  • works only with indexed repositories, however indexing a repository is a 1 line command that can be easily scheduled to run periodically

A custom Lucene index

Pros:

  • full text search
  • Maven Indexer is using Lucene too so that's a good sign I guess
  • more control over the information stored compared to Maven Indexer

Cons:

  • have to develop it (should not be too hard either)
  • stored in a file somewhere
  • not designed to store dependencies relations

JCR

I don't knows it very well

Pros:

  • full text search capabilities
  • can store dependencies relations

?:

  • scoring

XWiki database

Pro:

  • no need to store some file somewhere
  • easier to store dependencies relations

Cons:

  • fill the database with datas which are not really needed since that's after all only a cache to speedup things
  • Lucene is better for full text search which is the main use case

Other SQL based database

Other NoSQL based database

Getting repositories indexes

Maven

No index provided

First thing: the simplest possible maven repository does not provide any index of any kind which mean for theses one the only way is to follow link in a HTTP request and it probably takes ages to do (actually probably not since these kind of repository are generally small repositories with one project or so) but it's not very hard emoticon_wink

archetype-catalog.xml

Very easy to parse but contains almost nothing: groupid, artifactid and version. Nothing else...

That means we will need to download all the pom.xml in that repository to get useful informations so it's pretty slow too.

Maven indexes

See http://maven.apache.org/maven-indexer/index.html

Very complete (even contains Java classes for jar artifact for example and we could imagine provide wiki pages for a xar artifact since Nexus is extendable).

Some helpers:


 

Tags:
Created by Ecaterina Moraru (Valica) on 2013/11/11 12:52
    

Get Connected