Subject: Re: Lucene/Solr 7
From: Tomás Fernández Löbbe (toma@gmail.com)
Date: Jan 26, 2017 5:17:08 pm
List: org.apache.lucene.solr-dev

The other solution, delaying removal until 8.0, doesn't sound like a bad idea either.

I think that's what everyone is saying here: legacy numeric types will be supported in Solr at least until 8.0. They might be moved from Lucene to Solr, but they will certainly be supported for the life of Solr 7.x. "Deprecating" them in Solr just means changing the default schemas and so on, documenting that people should use PointFields instead, and keeping them around for users who have built their indices with them. As you said, that is the same as what was done with the previous numeric types.

On Thu, Jan 26, 2017 at 4:48 PM, Shawn Heisey <apa@elyograg.org> wrote:

On 1/26/2017 2:40 PM, Adrien Grand wrote:

I don't think this statement is accurate. Why would it affect Lucene users if they started using points for new indices when they upgraded to Lucene 6?

If that's the situation, then there won't be an issue. I suspect that there are quite a few users that have extremely large indexes that are difficult to reindex, and long-standing config/code using the legacy types. Those kinds of users will find that they cannot easily upgrade.

There are two cases. Either their current configuration uses points, which means the index was created with Lucene 6.0+ and will be fine with Lucene 7, or the index uses legacy numerics, but that means the index was created with Lucene 5, so Lucene 7 cannot read it anyway.

No Solr 6.x users have points, because the capability isn't there yet.

Points were first available (in actual released code) with the release of 6.0 ... but the legacy types were already deprecated before 6.0 was released. There was never any overlap where both types were considered fully viable. Perhaps Solr should have added points before the 6.0 release, but that didn't happen.

I don't expect users of Solr 5.x to be able to upgrade directly to version 7, but users of 6.x should be able to. Right now, that won't be possible if there are numeric types in the index. Most indexes have at least one numeric type.

On my own installs, I never upgrade with an existing index. I am able to do this because I've arranged my Solr servers in such a way that I can always completely rebuild one copy of my index from scratch while another copy remains online and serving requests, kept current independently of the rebuilding copy. Each copy of the index is not connected to the others in ANY way -- they can use entirely different versions and entirely different configs if that's what I need.

Not all users have the luxury that I do. Users with a typical replicated SolrCloud 6.x will be faced with a situation where they cannot do a rolling upgrade of their cloud to 7.x, which is going to make the upgrade process ugly. At some point they're going to have to completely rebuild all of their indexes. One SolrCloud user that I know of has *five terabytes* of index data in SolrCloud. Reindexing is a logistical nightmare, and something that I'm sure they don't want to combine with an upgrade.

The situation with ES is probably not quite as bad as what Solr is facing, but some users will still be impacted. The impact will be lessened with one of the two ideas I mentioned. An operation somewhere in 6.x that can convert the legacy numeric fields in the index to points would be REALLY good to have. The configuration of course will need changing to match the index.

The other solution, delaying removal until 8.0, doesn't sound like a bad idea either. Those old types are very widely used, and quite fundamental, so keeping them around for an extra major version will greatly reduce the amount of pain that users must endure when they upgrade.

This might be characterized as Lucene being punished for Solr's lack of foresight. I don't really agree with that characterization, but it's not entirely wrong, either. How much actual impact on development would result from keeping the legacy types around a while longer? Are there changes planned that would be extremely difficult or impossible with legacy code still present?

When the decision was made to combine the Lucene and Solr codebases, it had to be known that there would be a certain amount of shared baggage. I'm not sure that I would support that decision if I had any opportunity to comment, but it was made before I got involved in the code. If our community ever thinks about separating the two projects, I would support that. I'm sure it would be an enormous amount of work.

Historically, there was a similar situation in Solr where a whole bunch of old numeric types were deprecated in 3.x, with preference for the Trie types available much earlier. Those older types were not actually removed until 5.0. Small difference from the situation we're now facing -- I don't see any Lucene deprecations involved in that. It was all done on the Solr side.