Commits

Kristin Berry authored d4c84106ce1 Merge
Pull request #1338: PIPE-2094 generate run statistics file for internal use

Merge in PIPE/pipeline from PIPE-2094-generate-run-statistics-file-for-internal-use to main

* commit 'a40dbe6bb769ec8ccc60b62d007792da83fe205b': (23 commits)
  PIPE-2094: Remove non-existent file name from error message and also update source sorting to use source.name as the key.
  PIPE-2094: Remove unused imports, add key sorting to json file, consolidate nested if-statements.
  PIPE-2094: Remove draft stats values from this branch. Will include in a followup branch later.
  PIPE-2094: Fix issue that was preventing stats_extractor results from being added to the pool and update structure of flagdata_percentage to not repeat the MS name
  PIPE-2094: Update output to use TARGET instead of SOURCE and n_targets to n_target.
  PIPE-2094: Add n_pointings to SOURCE layer and update bands to be a list of band names.
  PIPE-2094: Remove RegressionExtractor as a parent to StatsExtractor and relegate potential restructuring of this to a future ticket.
  PIPE-2094: Add a 'SOURCE' level to the enum and output format. Update longdescriptions and two statistics value names based on feedback.
  PIPE-2094: Slight restructure to break up a large function.
  PIPE-2094: Update to use the spw id from the first MS for all top-level SPW attributes. Also add a handful of different statistics values to be included.
  PIPE-2094: Move main stats generating function from pipline_statistics to stats_extractor. Clean up and add documentation to stats_extractor.
  PIPE-2094: Fix virtual spw calculation
  PIPE-2094: Add docstrings, additional comments, and tidy-up code.
  PIPE-2094: Removed temporary flat format option.
  PIPE-2094: Switch to use virtual spws for SPW-level information and move this level to directly under MOUS instead of underneath the EB level
  PIPE-2094: Switch to use PipelineStatisticLevel enum. Add ‘EB’ and ‘SPW’ levels above the actual eb and spw information to the output json file. Update to only output EB information for 1 MS datatype to remove redundant output. Update project ID information to output a string instead of a list.
  PIPE-2094: Update to use nested dict structure for output.
  PIPE-2094: Add nested format output option and generally clean up code.
  PIPE-2094: Clean up and restructure code a bit. Add missing eb and mous information to spw-level stats
  PIPE-2094: Add context into stats extraction. Add some per-EB, per-MOUS, and per-SPW values. Add flat version of format in which level is specified ...
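
For readers unfamiliar with the output format these commits describe, here is a minimal sketch of a nested, key-sorted JSON statistics structure. Only the level names (MOUS, EB, SPW, TARGET), the `PipelineStatisticLevel` enum name, and the `n_pointings` key come from the commit messages. The helper names `build_stats` and `to_json`, the exact nesting, the `placeholder_stat` keys, and every UID and value are assumptions made for illustration, not the pipeline's actual code or output.

```python
import json
from enum import Enum


class PipelineStatisticLevel(Enum):
    # Level names as mentioned in the commit messages; the real enum may differ.
    MOUS = "MOUS"
    EB = "EB"
    SPW = "SPW"
    TARGET = "TARGET"


def build_stats(mous_uid, eb_stats, spw_stats, target_stats):
    """Hypothetical helper: nest the per-level dicts under one MOUS entry."""
    return {
        mous_uid: {
            PipelineStatisticLevel.EB.value: eb_stats,
            PipelineStatisticLevel.SPW.value: spw_stats,
            PipelineStatisticLevel.TARGET.value: target_stats,
        }
    }


def to_json(stats):
    # sort_keys=True gives a stable key order in the serialized file.
    return json.dumps(stats, sort_keys=True, indent=2)


# All identifiers and values below are invented placeholders.
example = build_stats(
    "uid___example_mous",
    eb_stats={"uid___example_eb": {"placeholder_stat": 0}},
    spw_stats={"16": {"placeholder_stat": 0}},
    target_stats={"target_a": {"n_pointings": 3}},
)
print(to_json(example))
```

Placing the SPW level directly under the MOUS rather than under each EB follows the commit that moves SPW-level information "to directly under MOUS"; everything else about the layout is a guess.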

pipeline/h/tasks/exportdata/exportdata.py

Modified
@@ -507,25 +507,24 @@
         # The keys are the session names
         # The values are a tuple containing the vislist and the caltables
         sessiondict = collections.OrderedDict()
         for i in range(len(session_names)):
             sessiondict[session_names[i]] = \
                 ([os.path.basename(visfile) for visfile in session_vislists[i]], \
                  os.path.basename(caltable_file_list[i]))
 
         return sessiondict
 
-    def _do_if_auxiliary_products(self, oussid, output_dir, products_dir, vislist, imaging_products_only):
+    def _do_if_auxiliary_products(self, oussid, output_dir, products_dir, vislist, imaging_products_only, pipeline_stats_file=None):
         """
         Generate the auxiliary products
         """
-
         if imaging_products_only:
             contfile_name = 'cont.dat'
             fluxfile_name = 'Undefined'
             antposfile_name = 'Undefined'
         else:
             fluxfile_name = 'flux.csv'
             antposfile_name = 'antennapos.csv'
             contfile_name = 'cont.dat'
         empty = True
 
@@ -565,20 +564,24 @@
         if timetracker_file_list:
             empty = False
 
         # PIPE-1802: look for the selfcal/restore resources
         selfcal_resources_list = []
         if hasattr(self.inputs.context, 'selfcal_resources') and isinstance(self.inputs.context.selfcal_resources, list):
             selfcal_resources_list = self.inputs.context.selfcal_resources
         if selfcal_resources_list:
             empty = False
 
+        # PIPE-2094: check for the pipeline stats file
+        if pipeline_stats_file and os.path.exists(pipeline_stats_file):
+            empty = False
+
         if empty:
             return None
 
         # Define the name of the output tarfile
         tarfilename = f'{oussid}.auxproducts.tgz'
         LOG.info('Saving auxiliary data products in %s', tarfilename)
 
         # Open tarfile
         with tarfile.open(os.path.join(products_dir, tarfilename), 'w:gz') as tar:
 
@@ -618,20 +621,26 @@
                     LOG.info('Saving auxiliary data product %s in %s', os.path.basename(timetracker_file), tarfilename)
                 else:
                     LOG.info('Auxiliary data product timetracker json report file does not exist')
 
             # PIPE-1802: Save selfcal restore resources
             for selfcal_resource in selfcal_resources_list:
                 if os.path.exists(selfcal_resource):
                     tar.add(selfcal_resource, arcname=selfcal_resource)
                     LOG.info('Saving auxiliary data product %s in %s', selfcal_resource, tarfilename)
 
+            # PIPE-2094: Save pipeline statistics file
+            if pipeline_stats_file and os.path.exists(pipeline_stats_file):
+                tar.add(pipeline_stats_file, arcname=pipeline_stats_file)
+                LOG.info('Saving pipeline statistics file %s in %s', pipeline_stats_file, tarfilename)
+            else:
+                LOG.info("Pipeline statistics file does not exist.")
             tar.close()
 
         return tarfilename
 
     def _make_pipe_manifest(self, context, oussid, stdfproducts, sessiondict, msvisdict, exportmses, calvisdict,
                             exportcalprods, calimages, calimages_fitskeywords, targetimages, targetimages_fitskeywords):
         """
         Generate the manifest file
         """
 
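
The change above follows a common pattern: treat a file as an optional product, skip the archive entirely when no product exists, and add the file to the tarball only when it is present on disk. The standalone sketch below illustrates that pattern with the standard-library `tarfile` module. The function name `bundle_aux_products`, the file name `pipeline_stats.json`, and the tarball names are hypothetical and do not appear in the pipeline code. Unlike the diff, this sketch stores each file under its base name rather than the path as given.

```python
import os
import tarfile
import tempfile


def bundle_aux_products(products_dir, tar_name, optional_files):
    """Hypothetical helper: archive whichever optional files exist.

    Returns the tarball name, or None when there is nothing to save.
    """
    existing = [f for f in optional_files if f and os.path.exists(f)]
    if not existing:
        return None

    with tarfile.open(os.path.join(products_dir, tar_name), "w:gz") as tar:
        for path in existing:
            tar.add(path, arcname=os.path.basename(path))
    # The with-block closes the archive, so no explicit tar.close() is needed.
    return tar_name


with tempfile.TemporaryDirectory() as workdir:
    # Placeholder statistics file; the name and contents are invented.
    stats_file = os.path.join(workdir, "pipeline_stats.json")
    with open(stats_file, "w") as fh:
        fh.write("{}")

    result = bundle_aux_products(workdir, "example.auxproducts.tgz", [stats_file, None])
    with tarfile.open(os.path.join(workdir, result)) as tar:
        members = tar.getnames()

    # Neither entry exists, so no tarball is written and None is returned.
    missing = bundle_aux_products(
        workdir, "empty.tgz", [None, os.path.join(workdir, "nope.json")]
    )
```

Defaulting the optional argument to `None`, as the diff does with `pipeline_stats_file=None`, keeps existing callers working without modification.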

