{{toc/}}

We need to put the extension descriptors from all repositories into an index so that we can search them quickly.

= Implementation =

How should the index be implemented?

== Maven Indexer (formerly Nexus Indexer) ==

http://maven.apache.org/maven-indexer/

This is what M2Eclipse uses to index Maven projects, for much the same need.

Pros:

* designed for that
* we need it anyway to download indexes from Maven repositories and parse them
* built-in incremental fetcher for remote Maven repository indexes

Cons:

* very Maven-oriented, and it lacks some of the information we want (it contains really [[minimal information>>http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-core/index.html]])
* stored in a file somewhere
* does not store dependencies
* works only with indexed repositories; however, indexing a repository is a one-line command that can easily be scheduled to run periodically

== A custom Lucene index ==

Pros:

* full-text search
* Maven Indexer uses Lucene too, which is a good sign
* more control over the stored information than with Maven Indexer

Cons:

* we have to develop it (though it should not be too hard)
* stored in a file somewhere
* not designed to store dependency relations

== JCR ==

I don't know it very well.

Pros:

* full-text search capabilities
* can store dependency relations

?:

* scoring

== XWiki database ==

Pros:

* no need to store a file somewhere
* easier to store dependency relations

Cons:

* fills the database with data that is not really needed, since this is after all only a cache to speed things up
* Lucene is better for full-text search, which is the main use case

== Other SQL based database ==

== Other NoSQL based database ==

= Getting repository indexes =

== Maven ==

=== No index provided ===

First thing: the simplest possible Maven repository does not provide an index of any kind. For such a repository the only way is to follow the links in its HTTP directory listings. That could take ages (actually probably not, since repositories of this kind are generally small, with one project or so), but it is not very hard ;)

* There is really not much point in spending time on supporting non-indexed Maven repositories. In a 5-minute search it is hard to find a Maven repository in the wild that is not already indexed, either by some form of repository manager or manually.
* Manually indexing a Maven repository is a one-line command:
** http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-cli/index.html
** https://github.com/apache/maven-indexer/tree/master/indexer-cli
** Download: http://central.maven.org/maven2/org/apache/maven/indexer/indexer-cli/

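To give a rough idea of the link-following approach for repositories without an index, here is a minimal Java sketch. The class name `DirectoryListingCrawler`, its methods and the `repo.example` URLs are hypothetical. It assumes the repository serves plain HTML directory listings with relative `href` links and that every directory URL ends with `/`. The page fetcher is passed in as a function so that the sketch runs without network access; a real crawler would plug in an HTTP client there.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DirectoryListingCrawler {
    // Matches href="..." attributes; good enough for simple generated listings (assumption).
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the relative links of a listing page, skipping parent, sort and absolute links. */
    public static List<String> childLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            String href = matcher.group(1);
            boolean leavesDirectory = href.startsWith("../") || href.startsWith("/")
                || href.startsWith("?") || href.contains("://");
            if (!leavesDirectory) {
                links.add(href);
            }
        }
        return links;
    }

    /** Recursively follows directory links below url and collects the URLs of all .pom files. */
    public static List<String> findPoms(String url, Function<String, String> fetch) {
        List<String> poms = new ArrayList<>();
        for (String href : childLinks(fetch.apply(url))) {
            if (href.endsWith("/")) {
                poms.addAll(findPoms(url + href, fetch));
            } else if (href.endsWith(".pom")) {
                poms.add(url + href);
            }
        }
        return poms;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a tiny repository (made-up content).
        Map<String, String> pages = new HashMap<>();
        pages.put("http://repo.example/", "<a href=\"../\">Parent</a> <a href=\"org/\">org/</a>");
        pages.put("http://repo.example/org/",
            "<a href=\"demo-1.0.pom\">pom</a> <a href=\"demo-1.0.jar\">jar</a>");
        System.out.println(findPoms("http://repo.example/", pages::get));
    }
}
```

Each `.pom` found this way would still have to be downloaded and parsed to build a descriptor, which is where most of the time would go.
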
=== archetype-catalog.xml ===

Very easy to parse but contains almost nothing: groupId, artifactId and version. Nothing else...

That means we would need to download every pom.xml in the repository to get useful information, so it is pretty slow too.

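As an illustration of how little there is to parse, the following Java sketch reads the coordinates out of a catalog using only the JDK's DOM parser. The class and method names are hypothetical and the sample catalog in `main` is made up. It assumes the usual `archetype-catalog/archetypes/archetype` layout with unprefixed element names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ArchetypeCatalogReader {
    /** Returns one "groupId:artifactId:version" string per archetype entry in the catalog. */
    public static List<String> readCoordinates(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The catalog comes from a remote server, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document document = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> coordinates = new ArrayList<>();
            NodeList archetypes = document.getElementsByTagName("archetype");
            for (int i = 0; i < archetypes.getLength(); i++) {
                Element archetype = (Element) archetypes.item(i);
                coordinates.add(childText(archetype, "groupId") + ":"
                    + childText(archetype, "artifactId") + ":"
                    + childText(archetype, "version"));
            }
            return coordinates;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse archetype catalog", e);
        }
    }

    /** Text of the first descendant element with the given tag name, or "" if it is missing. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String sample = "<archetype-catalog><archetypes>"
            + "<archetype><groupId>org.example</groupId><artifactId>demo</artifactId>"
            + "<version>1.0</version></archetype>"
            + "</archetypes></archetype-catalog>";
        System.out.println(readCoordinates(sample));
    }
}
```

Everything beyond these three coordinates (name, description, dependencies and so on) would have to come from the corresponding pom.xml.
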
=== Maven indexes ===

See http://maven.apache.org/maven-indexer/index.html

Very complete (it even contains the Java classes of a jar artifact, for example, and we could imagine providing the wiki pages of a xar artifact, since Nexus is extensible).

* Actually, the available indexed information is quite minimal, at least for our use cases. We can easily extend the indexed information, but only for the indexes of repositories that we manage; all other repositories in the wild will have default indexes with the default minimal information.
* http://maven.apache.org/maven-indexer-archives/maven-indexer-LATEST/indexer-core/index.html

Some helpers:

* https://github.com/apache/maven-indexer/tree/master/indexer-examples
* https://github.com/cstamas/maven-indexer-examples ([[obsolete>>https://github.com/cstamas/maven-indexer-examples/issues/4]], use the one above)