Sean Dague 4915ebb1a7 add SearchResultSet and Hit objects
In an attempt at long-term simplification of the source tree, this
is the beginning of a ResultSet and Hit object type. The ResultSet
is constructed from the JSON structure returned by ElasticSearch, and
it builds hits internally.

ResultSet is iterable and indexable, so that you can easily loop
through its hits. Both ResultSet and Hit objects have dynamic attributes
to make accessing the deep data structures easier (without having
to make everything explicit) and to handle the multiline collapse
correctly.
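
The behaviour described above can be sketched roughly as follows. This is
not the code from this change: SketchResultSet and SketchHit are
hypothetical names, the response shape ("hits" -> "hits" -> "_source") is
an assumed ElasticSearch-style layout, and treating a list-valued field as
its first element is only one possible reading of "multiline collapse".

```python
class SketchHit(object):
    """Hypothetical wrapper around a single hit dictionary."""

    def __init__(self, hit):
        self._hit = hit

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        source = self._hit.get("_source", {})
        if name not in source:
            raise AttributeError(name)
        value = source[name]
        # Assumed collapse rule: a list-valued field yields its first item.
        if isinstance(value, list) and value:
            return value[0]
        return value


class SketchResultSet(object):
    """Hypothetical result set built from a search response dictionary."""

    def __init__(self, results):
        self._results = results
        self._hits = [SketchHit(h) for h in results["hits"]["hits"]]

    def __iter__(self):
        return iter(self._hits)

    def __getitem__(self, index):
        return self._hits[index]

    def __getattr__(self, name):
        # Expose top-level response keys as attributes.
        if name.startswith("_") or name not in self._results:
            raise AttributeError(name)
        return self._results[name]


sample = {
    "took": 5,
    "hits": {"total": 2, "hits": [
        {"_source": {"build_uuid": "abc",
                     "build_status": ["FAILURE", "FAILURE"]}},
        {"_source": {"build_uuid": "def", "build_status": "SUCCESS"}},
    ]},
}
results = SketchResultSet(sample)
statuses = [hit.build_status for hit in results]
```

Looping, indexing (results[1].build_uuid) and top-level access
(results.took) then all work without spelling out the nested dictionary
keys at each call site.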

A basic set of tests is included, as well as sample json dumps for all
the current bugs in the system for additional unit testing. Fortunately
this includes bugs which have hits, and those that don't.

In order to use ResultSet we need to pass everything through
our own SearchEngine object, so we get results back as expected.

We also need to teach ResultSet about facets, as those get used
when attempting to find specific files.

Lastly, we need __len__ implementation for ResultSet to support
the wait loop correctly.
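
Why __len__ matters for a wait loop can be illustrated with a small
sketch. CountedResults and wait_for_hits are hypothetical names, not the
project's own; the point is only that a polling loop needs a way to ask
"did anything come back yet?". An object that defines neither __len__ nor
a truth-value method is always truthy, so an empty result set would
otherwise look like a successful one.

```python
import time


class CountedResults(object):
    """Hypothetical result container that reports how many hits it holds."""

    def __init__(self, hits):
        self._hits = list(hits)

    def __len__(self):
        return len(self._hits)


def wait_for_hits(search, attempts=3, delay=0.0):
    """Call search() until it returns a non-empty result or attempts run out."""
    for _ in range(attempts):
        results = search()
        if len(results) > 0:
            return results
        time.sleep(delay)
    return None


# Simulate an index that only has the data on the third poll.
responses = iter([CountedResults([]),
                  CountedResults([]),
                  CountedResults(["hit"])])
found = wait_for_hits(lambda: next(responses))
```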

ResultSet lets us simplify a bit of the code in elasticRecheck,
so port it over.

There is a short term fix in the test_classifier test to get us
working here until real stub data can be applied.

Change-Id: I7b0d47a8802dcf6e6c052f137b5f9494b1b99501
2013-10-21 13:45:55 -04:00

#!/usr/bin/env python
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
from __future__ import print_function

import argparse

import elastic_recheck.elasticRecheck as er


def get_options():
    parser = argparse.ArgumentParser(
        description='Report hit counts for known elastic-recheck queries.')
    parser.add_argument('--file', '-f', help="Queries file",
                        default="queries.yaml")
    return parser.parse_args()


def collect_metrics(classifier):
    data = {}
    for q in classifier.queries:
        results = classifier.hits_by_query(q['query'], size=3000)
        # Map each build status to the set of build UUIDs that hit the query.
        rate = {}
        for hit in results:
            uuid = hit.build_uuid
            success = hit.build_status
            if success not in rate:
                # Wrap the UUID in a list: set(uuid) would split the string
                # into individual characters.
                rate[success] = set([uuid])
            else:
                rate[success].add(uuid)

        num_fails = 0
        if "FAILURE" in rate:
            num_fails = len(rate["FAILURE"])

        data[q['bug']] = {
            'fails': num_fails,
            'hits': rate,
            'query': q['query']
        }

    return data


def print_metrics(data):
    print("Elastic recheck known issues")
    # Bugs with the most failing builds come first.
    sorted_data = sorted(data.items(),
                         key=lambda x: -x[1]['fails'])
    for d in sorted_data:
        print("Bug: %s => %s" % (d[0], d[1]['query'].rstrip()))
        for s in d[1]['hits'].keys():
            print(" %s: %s" % (s, len(d[1]['hits'][s])))
        print()


def main():
    opts = get_options()
    classifier = er.Classifier(opts.file)
    data = collect_metrics(classifier)
    print_metrics(data)


if __name__ == "__main__":
    main()
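
The per-status grouping that collect_metrics performs can be tried in isolation, without an ElasticSearch server. The following is a minimal sketch under stated assumptions: StubHit and group_by_status are hypothetical names, and the stub carries only the two attributes the script reads (build_uuid and build_status). It uses collections.defaultdict so that each UUID is added to its set as a whole string; building a set directly from a string, as in set("abc"), yields the individual characters instead.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a search hit; only the attributes used here.
StubHit = namedtuple("StubHit", ["build_uuid", "build_status"])


def group_by_status(hits):
    """Map each build status to the set of distinct build UUIDs seen."""
    rate = defaultdict(set)
    for hit in hits:
        rate[hit.build_status].add(hit.build_uuid)
    return dict(rate)


hits = [
    StubHit("aaa111", "FAILURE"),
    StubHit("aaa111", "FAILURE"),  # same build matched twice
    StubHit("bbb222", "FAILURE"),
    StubHit("ccc333", "SUCCESS"),
]
rate = group_by_status(hits)
# Count distinct failing builds, not matching log lines.
num_fails = len(rate.get("FAILURE", ()))
```

Because the values are sets, a build that matches a query on several log lines is still counted once, which is what the 'fails' figure in the report relies on.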