Code documentation¶
nctime¶
Module author: Guillaume Levavasseur <glipsl@ipsl.jussieu.fr>
-
nctime.nctcck.
get_args
(args=None)[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
Module author: Guillaume Levavasseur <glipsl@ipsl.jussieu.fr>
-
nctime.nctxck.
get_args
(args=None)[source]¶ Returns parsed command-line arguments.
Returns: The argument parser Return type: argparse.Namespace
overlap¶
platform: | Unix |
---|---|
synopsis: | Highlight chunked NetCDF files producing overlap in a time series. |
-
nctime.overlap.main.
get_overlaps
(g, shortest)[source]¶ - Returns all overlapping files (default) as a list of tuples. Each tuple gathers:
- The higher overlap bound,
- The date to cut the file in order to resolve the overlap,
- The corresponding cutting timestep.
Parameters: - g (networkx.DiGraph()) – The directed graph
- shortest (list) – The most consecutive files list (from nx.DiGraph().shortest_path)
Returns: The filenames
Return type: list
-
nctime.overlap.main.
resolve_overlap
(ffp, pattern, from_date=None, to_date=None, cutting_timestep=None, partial=False)[source]¶ Resolve overlapping files. If full overlap, the corresponding file is removed. If partial overlap, the corresponding file is truncated into a new one and the old file is removed.
Parameters: - ffp (str) – The file full path
- pattern (str) – The filename pattern
- from_date (int) – Overlap starting date
- to_date (int) – Overlap ending date
- cutting_timestep (int) – Time step to cut the file
- partial (boolean) – Resolve partial overlap if True
-
nctime.overlap.main.
extract_dates
(ffp)[source]¶ Extracts date attributes from a NetCDF file:
- start = the start date of the file sub-period
- end = the end date of the file sub-period
- next = the date next to the end date depending on the frequency, the calendar, etc.
- first_step = the first time axis step
- last_step = the last time axis step
- path = the file full path
Parameters: ffp (str) – The file full path to process
-
nctime.overlap.main.
create_nodes
(fh)[source]¶ Creates the node into the corresponding Graph(). One directed graph per dataset. Each file is analysed to be a node in its graph. A node is a filename with some attributes:
- start = the start date of the file sub-period
- end = the end date of the file sub-period
- next = the date next to the end date depending on the frequency, the calendar, etc.
- first_step = the first time axis step
- last_step = the last time axis step
- path = the file full path
Parameters: fh (handler.Filename) – The filename handler
-
nctime.overlap.main.
create_edges
(gid)[source]¶ Creates the edges between nodes into the corresponding Graph(). One directed graph per dataset.
- Builds the “START” node with the appropriate edges to the nodes with the earliest start date
- Builds the “END” node with the appropriate edges from the nodes with the latest end date
- Builds edges between a node and its “backwards” nodes
Parameters: gid (str) – The graph id
-
nctime.overlap.main.
evaluate_graph
(gid)[source]¶ Evaluates the directed graph, looking for a shortest path between the “START” and “END” nodes. If a shortest path is found, it looks for potential overlaps. Partial and full overlaps are supported. If no shortest path is found, it creates a new “BREAK” node in the path.
Parameters: gid (str) – The graph id
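The graph evaluation described above can be sketched without networkx. This is a minimal illustration, not the package's implementation: the node values, the edge rule and the helper names `build_graph` and `shortest_path` are all assumptions made for the example, and a plain breadth-first search stands in for the networkx shortest-path call.

```python
from collections import deque

# Hypothetical nodes: filename -> (start, end, next) as YYYYMM integers.
NODES = {
    "tas_185001-185912.nc": (185001, 185912, 186001),
    "tas_186001-186912.nc": (186001, 186912, 187001),
    "tas_186501-186912.nc": (186501, 186912, 187001),  # covered by the file above
}


def build_graph(nodes):
    """Return an adjacency dict with "START" and "END" sentinel nodes."""
    first = min(start for start, _, _ in nodes.values())
    last = max(end for _, end, _ in nodes.values())
    graph = {"START": [], "END": []}
    for name, (start, end, _) in nodes.items():
        graph[name] = []
        if start == first:
            graph["START"].append(name)
        if end == last:
            graph[name].append("END")
    # Assumed edge rule: B may follow A if B starts no later than A's "next"
    # date and extends the covered period.
    for a, (_, a_end, a_next) in nodes.items():
        for b, (b_start, b_end, _) in nodes.items():
            if a != b and b_start <= a_next and b_end > a_end:
                graph[a].append(b)
    return graph


def shortest_path(graph, source="START", target="END"):
    """Breadth-first search; returns the node list, or None if the path is broken."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbour in graph[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(NODES)
path = shortest_path(graph)
# Files left out of the path are candidates for a full overlap.
unused = [name for name in NODES if name not in (path or [])]
```

In this sketch the third file never enters the path because the second file already covers its period, which is the situation the documentation calls a full overlap.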
-
nctime.overlap.main.
format_path
(path, partial_overlaps, full_overlaps)[source]¶ Formats the message to print as a diagnostic. It walks through the evaluated path of the directed graph and prints the filenames with useful information.
Parameters: - path (list) – The node path as a result of the directed graph evaluation
- partial_overlaps (dict) – Dictionary of partial overlaps
- full_overlaps (list) – List of full overlapping files
Returns: The formatted diagnostic to print
Return type: str
-
nctime.overlap.main.
yield_filedef
(paths)[source]¶ Yields all file definitions produced by dr2xml as input for XIOS.
Parameters: paths (list) – The list of filedef path Returns: The dr2xml files Return type: iter
-
nctime.overlap.main.
get_patterns_from_filedef
(path)[source]¶ Parses dr2xml files. Each filename pattern is deserialized as a dictionary of facet: value.
Parameters: path (str) – The path to scan Returns: The filename deserialized with facets Return type: dict
-
nctime.overlap.main.
initializer
(keys, values)[source]¶ Initialize process context by setting particular variables as global variables.
Parameters: - keys (list) – Argument name list
- values (list) – Argument value list
-
nctime.overlap.main.
run
(args=None)[source]¶ Main process that:
- Instantiates processing context,
- Deduces start, end and next dates from each filename,
- Builds the DiGraph,
- Detects the shortest path between dates if exists,
- Detects broken path between dates if exists,
- Removes the overlapping files.
Parameters: args (ArgumentParser) – Command-line arguments parser
platform: | Unix |
---|---|
synopsis: | File handler for time axis. |
-
class
nctime.overlap.handler.
Filename
(ffp)[source]¶ Handler providing methods to deal with file processing.
-
get_start_end_dates
(pattern, calendar)[source]¶ Wraps and records the results of get_start_end_dates_from_filename().
Parameters: - pattern (re object) – The filename pattern as a regex (from re library).
- calendar (str) – The NetCDF calendar attribute
Returns: Start and end dates as number of days since the referenced date
Return type: float
-
nc_att_get
(attribute, variable=None)[source]¶ Gets an attribute from the NetCDF file. By default, the attribute is looked up among the global attributes. If the attribute key is not found, the closest key name is used instead.
Parameters: - attribute (str) – The attribute key to get
- variable (str) – The variable from which to find the attribute. Global is None.
Returns: The attribute value
Return type: str
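The "closest key name" fallback can be illustrated with the standard library. How nctime measures closeness is not stated here, so the use of `difflib` is an assumption, and the `attributes` dictionary and `att_get` helper are hypothetical stand-ins for an opened NetCDF file and the handler method.

```python
import difflib

# Hypothetical stand-in for the global attributes of an opened NetCDF file.
attributes = {"frequency": "mon", "calendar": "noleap", "realm": "atmos"}


def att_get(attributes, key):
    """Return the value for key, falling back on the closest existing key."""
    if key in attributes:
        return attributes[key]
    candidates = difflib.get_close_matches(key, list(attributes), n=1)
    if not candidates:
        raise KeyError(key)
    return attributes[candidates[0]]


# The misspelled key resolves to the existing "calendar" attribute.
value = att_get(attributes, "calender")
```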
-
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
nctime.overlap.context.
ProcessingContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – Parsed command-line arguments Returns: The processing context Return type: ProcessingContext
-
nctime.overlap.context.
yield_xml_from_card
(card_path)[source]¶ Yields XML path from run.card and config.card attributes.
Parameters: card_path (str) – Directory including run.card and config.card Returns: The XML paths to use Return type: iter
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this module. |
axis¶
platform: | Unix |
---|---|
synopsis: | Rewrite and/or check time axis of MIP NetCDF files. |
-
nctime.axis.main.
process
(ffp)[source]¶ Process time axis checkup and rewriting if needed.
Parameters: ffp (str) – The file full path to process Returns: The file status Return type: list
-
nctime.axis.main.
initializer
(keys, values)[source]¶ Initialize process context by setting particular variables as global variables.
Parameters: - keys (list) – Argument name list
- values (list) – Argument value list
-
nctime.axis.main.
run
(args=None)[source]¶ Main process that:
- Instantiates processing context,
- Defines the referenced time properties,
- Instantiates threads pools,
- Prints or logs the time axis diagnostics.
Parameters: args (ArgumentParser) – Command-line arguments parser
platform: | Unix |
---|---|
synopsis: | File handler for time axis. |
-
class
nctime.axis.handler.
File
(ffp, pattern, ref_units, ref_calendar, input_start_timestamp=None, input_end_timestamp=None)[source]¶ Handler providing methods to deal with file processing.
Returns: The file handler Return type: File -
build_time_axis
()[source]¶ Rebuilds time axis from date axis, depending on MIP frequency, calendar and instant status.
Returns: The corresponding theoretical time axis Return type: numpy.array
-
build_time_bounds
()[source]¶ Rebuilds time boundaries from the start date, depending on MIP frequency, calendar and instant status.
Returns: The corresponding theoretical time boundaries as a [n, 2] array Return type: numpy.array
-
check_axis_length
(axis)[source]¶ numpy.arange can unexpectedly append the endpoint to the array when the number of steps is high, because of floating-point rounding and the pre-calculated length in memory. As a workaround, always check the length and remove the last point if the lengths differ.
Parameters: axis (numpy.array) – The axis to check Returns: The cut axis Return type: numpy.array
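The workaround amounts to comparing the rebuilt axis with the length expected from the file and dropping one trailing point. The sketch below assumes the expected length is passed explicitly (the documented method takes only `axis`, so it presumably reads the length from the handler) and uses a plain list, although the same slicing works on a NumPy array.

```python
def check_axis_length(axis, expected_length):
    """Drop the trailing point when the axis is one step too long."""
    if len(axis) == expected_length + 1:
        return axis[:-1]
    return axis


# Hypothetical axis of 4 daily steps with a spurious fifth point.
axis = [0.5, 1.5, 2.5, 3.5, 4.5]
trimmed = check_axis_length(axis, expected_length=4)
```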
-
nc_var_delete
(variable)[source]¶ Delete a NetCDF variable using NCO operators. A unique filename is generated to avoid multiprocessing errors. To overwrite the input file, the source file is dumped using the cat shell command to avoid the Python memory limit.
Parameters: variable (str) – The variable to delete Raises: Error – If the deletion failed
-
nc_att_delete
(variable, attribute)[source]¶ Delete a NetCDF dimension attribute using NCO operators. A unique filename is generated to avoid multiprocessing errors. To overwrite the input file, the source file is dumped using the cat shell command to avoid the Python memory limit.
Parameters: - attribute (str) – The attribute to delete
- variable (str) – The variable that has the attribute
Raises: Error – If the deletion failed
-
nc_var_overwrite
(variable, data)[source]¶ Rewrite variable to NetCDF file without copy.
Parameters: - variable (str) – The variable to replace
- data (float array) – The data array to write in place of the current values
-
nc_att_overwrite
(attribute, data, variable=None)[source]¶ Rewrite attribute to NetCDF file without copy.
Parameters: - attribute (str) – The attribute to replace
- data (str) – The string that overwrites the attribute value
- variable (str) – The variable that has the attribute, default is global attributes
-
nc_att_get
(attribute, variable=None)[source]¶ Gets an attribute from the NetCDF file. By default, the attribute is looked up among the global attributes. If the attribute key is not found, the closest key name is used instead.
Parameters: - attribute (str) – The attribute key to get
- variable (str) – The variable from which to find the attribute. Global is None.
Returns: The attribute value
Return type: str
-
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this module. |
-
exception
nctime.axis.custom_exceptions.
NetCDFVariableRemoveFail
(variable, path)[source]¶ Raised when NetCDF variable removal fails.
-
exception
nctime.axis.custom_exceptions.
NetCDFAttributeRemoveFail
(attribute, path, variable=None)[source]¶ Raised when NetCDF attribute removal fails.
platform: | Unix |
---|---|
synopsis: | Constants used in this module. |
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
utils¶
platform: | Unix |
---|---|
synopsis: | Useful functions to collect data from directories. |
-
class
nctime.utils.collector.
Collector
(sources, spinner=False)[source]¶ Base collector class to yield regular NetCDF files.
Parameters: sources (list) – The list of sources to parse Returns: The data collector Return type: iter
-
class
nctime.utils.collector.
FilterCollection
[source]¶ Regex dictionary with a call method to evaluate a string against several regular expressions. The dictionary values are 2-tuples with the regular expression as a string and a boolean indicating to match (i.e., include) or non-match (i.e., exclude) the corresponding expression.
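A dictionary of this shape might look like the following sketch. The class name `RegexFilters`, the filter keys and the choice of `re.search` are assumptions for illustration; the actual class may combine or evaluate its expressions differently.

```python
import re


class RegexFilters(dict):
    """Maps a name to (regex string, inclusive flag) and evaluates them all."""

    def __call__(self, string):
        # Every filter must agree: inclusive ones match, exclusive ones do not.
        return all(
            (re.search(regex, string) is not None) == inclusive
            for regex, inclusive in self.values()
        )


filters = RegexFilters()
filters["netcdf_only"] = (r"\.nc$", True)   # keep NetCDF files
filters["no_hidden"] = (r"^\.", False)      # drop hidden files

kept = [name for name in ["tas.nc", ".tas.nc", "notes.txt"] if filters(name)]
```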
platform: | Unix |
---|---|
synopsis: | Constants used in this package. |
platform: | Unix |
---|---|
synopsis: | Processing context used in this module. |
-
class
nctime.utils.context.
BaseContext
(args)[source]¶ Encapsulates the processing context/information for main process.
Parameters: args (ArgumentParser) – Parsed command-line arguments Returns: The processing context Return type: ProcessingContext
platform: | Unix |
---|---|
synopsis: | Useful functions to use with this package. |
-
class
nctime.utils.misc.
ncopen
(path, mode='r')[source]¶ Properly opens a netCDF file
Parameters: path (str) – The netCDF file full path Returns: The netCDF dataset object Return type: netCDF4.Dataset
-
nctime.utils.misc.
match
(pattern, string, inclusive=True)[source]¶ Validates a string against a regular expression. Only match at the beginning of the string. Default is to match inclusive regex.
Parameters: - pattern (str) – The regular expression to match
- string (str) – The string to test
- inclusive (boolean) – False if negative matching (i.e., exclude the regex)
Returns: True if it matches
Return type: boolean
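The behaviour described above maps directly onto `re.match`, which only matches at the beginning of the string. The following is a minimal sketch of that reading, not a copy of the package function.

```python
import re


def match(pattern, string, inclusive=True):
    """True if string matches pattern at its start (or does not, if exclusive)."""
    matched = re.match(pattern, string) is not None
    return matched if inclusive else not matched


starts_with_tas = match(r"tas_", "tas_Amon.nc")
found_in_middle = match(r"Amon", "tas_Amon.nc")
excluded_pr = match(r"pr_", "tas_Amon.nc", inclusive=False)
```

Because matching is anchored at the start, a pattern that only occurs in the middle of the filename does not count as a match.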
-
nctime.utils.misc.
get_project
(ffp)[source]¶ Get project identifier from netCDF file.
Parameters: ffp (str) – The file full path Returns: The lower-case project id Return type: str
-
class
nctime.utils.misc.
ProcessContext
(args)[source]¶ Encapsulates the processing context/information for child process.
Parameters: args (dict) – Dictionary of argument to pass to child process Returns: The processing context Return type: ProcessContext
platform: | Unix |
---|---|
synopsis: | Useful functions to use with this package. |
-
class
nctime.utils.custom_print.
COLOR
(color=None)[source]¶ Defines a color object for print statements. Default is no color (i.e., restore original color).
-
class
nctime.utils.custom_print.
_TAGS
[source]¶ Tag strings for print statements. These are evaluated as properties, in order to defer evaluation until after enable_colors or disable_colors has been called during initialisation.
-
class
nctime.utils.custom_print.
Print
[source]¶ Class to manage and dispatch print statement depending on log and debug mode.
platform: | Unix |
---|---|
synopsis: | Class and methods used to parse command-line arguments. |
-
class
nctime.utils.parser.
MultilineFormatter
(prog, default_columns=120)[source]¶ Custom formatter class for argument parser to use with the Python argparse module.
-
class
nctime.utils.parser.
DirectoryChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Custom action class for argument parser to use with the Python argparse module.
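A custom argparse action receives the raw option value in `__call__` and decides what to store on the namespace. The sketch below shows the general pattern with a hypothetical `ExistingDirectory` action and a made-up `--log` option; the checks performed by the real class are not documented here.

```python
import argparse
import os
import tempfile


class ExistingDirectory(argparse.Action):
    """Store the absolute path of the option value if it is a directory."""

    def __call__(self, parser, namespace, values, option_string=None):
        path = os.path.abspath(os.path.normpath(values))
        if not os.path.isdir(path):
            # parser.error() prints the usage message and exits.
            parser.error("No such directory: {}".format(path))
        setattr(namespace, self.dest, path)


parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument("--log", action=ExistingDirectory)

# The system temporary directory is used as a directory known to exist.
args = parser.parse_args(["--log", tempfile.gettempdir()])
```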
-
class
nctime.utils.parser.
InputChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Checks if the supplied input exists.
-
class
nctime.utils.parser.
CodeChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Checks if the supplied input exists.
-
class
nctime.utils.parser.
TimestampChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Checks if the supplied timestamp is valid.
-
class
nctime.utils.parser.
CalendarChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Checks if the supplied calendar is valid.
-
class
nctime.utils.parser.
TimeUnitsChecker
(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]¶ Checks if the supplied time units has valid format.
-
nctime.utils.parser.
regex_validator
(string)[source]¶ Validates a Python regular expression syntax.
Parameters: string (str) – The string to check Returns: The Python regex Return type: re.compile Raises: Error – If invalid regular expression
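A validator like this is typically passed to argparse as a `type` callable. The sketch below assumes the error is reported through `argparse.ArgumentTypeError`; the documentation only says "Error", so the exception class and the `--include` option are illustrative choices.

```python
import argparse
import re


def regex_validator(string):
    """Compile string as a regular expression or reject the argument."""
    try:
        return re.compile(string)
    except re.error:
        raise argparse.ArgumentTypeError("Bad regex syntax: {}".format(string))


parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument("--include", type=regex_validator)
args = parser.parse_args(["--include", r"^tas_.*\.nc$"])
```

The parsed value is then a compiled pattern, so `args.include.match(filename)` can be used directly.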
-
nctime.utils.parser.
positive_only
(value)[source]¶ Validates a positive number.
Parameters: value (str) – The value submitted Returns:
-
nctime.utils.parser.
processes_validator
(value)[source]¶ Validates the max processes number.
Parameters: value (str) – The max processes number submitted Returns:
-
nctime.utils.parser.
inc_converter
(string)[source]¶ Checks the increment value syntax.
Parameters: string (str) – The string to check Returns: The key/value tuple Return type: list Raises: Error – If invalid pair syntax
-
nctime.utils.parser.
table_inc_converter
(pair)[source]¶ Checks the key value syntax.
Parameters: pair (str) – The key/value pair to check Returns: The key/value pair Return type: list Raises: Error – If invalid pair syntax
platform: | Unix |
---|---|
synopsis: | Custom exceptions used in this package. |
-
exception
nctime.utils.custom_exceptions.
InvalidNetCDFFile
(path)[source]¶ Raised when not a NetCDF file.
-
exception
nctime.utils.custom_exceptions.
RenamingNetCDFFailed
(src, dst, exists=False)[source]¶ Raised when NetCDF renaming failed.
-
exception
nctime.utils.custom_exceptions.
NoNetCDFAttribute
(attribute, path, variable=None)[source]¶ Raised when a NetCDF attribute is missing.
-
exception
nctime.utils.custom_exceptions.
NoNetCDFVariable
(variable, path)[source]¶ Raised when a NetCDF variable is missing.
-
exception
nctime.utils.custom_exceptions.
NetCDFTimeStepNotFound
(value, path)[source]¶ Raised when a NetCDF time index is not found.
-
exception
nctime.utils.custom_exceptions.
EmptyTimeAxis
(path)[source]¶ Raised when a NetCDF time axis is empty.
-
exception
nctime.utils.custom_exceptions.
InvalidFrequency
(frequency)[source]¶ Raised when frequency is unknown.
-
exception
nctime.utils.custom_exceptions.
InvalidTable
(frequency)[source]¶ Raised when table is unknown.
-
exception
nctime.utils.custom_exceptions.
InvalidClimatologyFrequency
(frequency)[source]¶ Raised when climatology frequency is unknown.
-
exception
nctime.utils.custom_exceptions.
InvalidUnits
(units)[source]¶ Raised when time units is unknown.
-
exception
nctime.utils.custom_exceptions.
NoFileFound
(paths)[source]¶ Raised when no file is found.
-
exception
nctime.utils.custom_exceptions.
NoRunCardFound
(path)[source]¶ Raised when no file patterns found in filedef.
-
exception
nctime.utils.custom_exceptions.
NoConfigCardFound
(path)[source]¶ Raised when no file patterns found in filedef.
platform: | Unix |
---|---|
synopsis: | Methods used to deal with NetCDF file time axis. |
-
class
nctime.utils.time.
TimeInit
(ref, tunits_default=None)[source]¶ Encapsulates the time properties from the first file into the processing context. These properties have to be used as reference for all files in the directory.
- The calendar, the frequency and the realm are read from NetCDF global attributes and used to detect an instantaneous time axis,
- The NetCDF time units attribute must remain unchanged, in accordance with the CF convention and archive designs.
Parameters: - ref (str) – The reference file full path
- tunits_default (str) – The default time units if exists
Raises: - Error – If NetCDF time units attribute is missing
- Error – If NetCDF frequency attribute is missing
- Error – If NetCDF realm attribute is missing
- Error – If NetCDF calendar attribute is missing
-
nctime.utils.time.
control_time_units
(tunits, tunits_default=None)[source]¶ Controls the time units format as at least “days since YYYY-MM-DD”. The time units can be forced within the configuration file using the time_units_default option.
Parameters: - tunits (str) – The NetCDF time units string from file
- tunits_default (str) – The default time units that should be used
Returns: The appropriate time units string formatted and controlled depending on the project
Return type: str
-
nctime.utils.time.
convert_time_units
(tunits, table, frequency)[source]¶ Converts default time units from file into time units using the MIP frequency. As an example, for a 3-hourly file, the time units “days since YYYY-MM-DD” become “hours since YYYY-MM-DD”.
Parameters: - tunits (str) – The NetCDF time units string from file
- frequency (str) – The time frequency
- table (str) – The MIP table
Returns: The converted time units string
Return type: str
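The 3-hourly example can be written as a short string transformation. This sketch ignores the `table` argument and relies on a hypothetical `FREQUENCY_UNITS` mapping, so it shows the idea only, not the rules the package applies for each MIP table.

```python
# Hypothetical mapping from a MIP frequency to the unit of its time axis.
FREQUENCY_UNITS = {"3hr": "hours", "6hr": "hours", "day": "days", "mon": "days"}


def convert_time_units(tunits, frequency):
    """Swap the leading unit of a "<unit> since <date>" string."""
    _, reference = tunits.split(" since ", 1)
    return "{} since {}".format(FREQUENCY_UNITS[frequency], reference)


converted = convert_time_units("days since 1850-01-01", "3hr")
```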
-
nctime.utils.time.
untruncated_timestamp
(timestamp)[source]¶ Returns proper digits for yearly and monthly truncated timestamps. The dates from the filename are filled with the 0 digit to reach 14 digits. Consequently, yearly dates start on January 1st and monthly dates start on the first day of the month.
Parameters: timestamp (str) – A date string from a filename Returns: The filled timestamp Return type: str
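One way to read this description is that the month and day are first completed with "01" and the remaining positions are then filled with zeros. The sketch below implements that reading; it is an assumption about the behaviour, since plain zero-filling alone would yield month and day "00".

```python
def untruncated_timestamp(timestamp):
    """Expand a truncated timestamp to a 14-digit YYYYMMDDhhmmss string."""
    if len(timestamp) == 4:
        timestamp += "0101"   # yearly: start on January 1st
    elif len(timestamp) == 6:
        timestamp += "01"     # monthly: start on the first day of the month
    return timestamp.ljust(14, "0")


yearly = untruncated_timestamp("1850")
monthly = untruncated_timestamp("185003")
hourly = untruncated_timestamp("1850031512")
```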
-
nctime.utils.time.
truncated_timestamp
(date, length)[source]¶ Returns proper digits depending on datetime object.
Parameters: - date (datetime.datetime) – Datetime or phony datetime object
- length (int) – The timestamp length expected
Returns: The corresponding timestamp
Return type: str
-
nctime.utils.time.
num2date
(num_axis, units, calendar)[source]¶ A wrapper around netCDF4.num2date able to handle “years since” and “months since” units. If the time units are not “years since” or “months since”, it calls the usual netcdftime.num2date.
Parameters: - num_axis (numpy.array) – The numerical time axis following units
- units (str) – The proper time units
- calendar (str) – The NetCDF calendar attribute
Returns: The corresponding date axis
Return type: array
-
nctime.utils.time.
date2num
(date_axis, units, calendar)[source]¶ A wrapper around netCDF4.date2num able to handle “years since” and “months since” units. If the time units are not “years since” or “months since”, it calls the usual netcdftime.date2num.
Parameters: - date_axis (numpy.array) – The date axis following units
- units (str) – The proper time units
- calendar (str) – The NetCDF calendar attribute
Returns: The corresponding numerical time axis
Return type: array
-
nctime.utils.time.
add_month
(date, months_to_add)[source]¶ Finds the next month from date.
Parameters: - date (netcdftime.datetime) – Accepts datetime or phony datetime from netCDF4.num2date.
- months_to_add (int) – The number of months to add to the date
Returns: The final date
Return type: netcdftime.datetime
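Month arithmetic of this kind can be sketched with the standard `datetime` class in place of netcdftime objects. The version below is an illustration under that assumption; it keeps the day unchanged, so it raises `ValueError` when the target month has no such day (for example January 31st plus one month).

```python
from datetime import datetime


def add_month(date, months_to_add):
    """Shift date by a number of months, keeping day and time unchanged."""
    # Count months from year 0 so that divmod handles the year roll-over.
    total = date.year * 12 + (date.month - 1) + months_to_add
    year, month_index = divmod(total, 12)
    return date.replace(year=year, month=month_index + 1)


shifted = add_month(datetime(1850, 11, 1), 3)
previous = add_month(datetime(1850, 1, 1), -1)
```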
-
nctime.utils.time.
add_year
(date, years_to_add)[source]¶ Finds the next year from date.
Parameters: - date (netcdftime.datetime) – Accepts datetime or phony datetime from netCDF4.num2date.
- years_to_add (int) – The number of years to add to the date
Returns: The final date
Return type: netcdftime.datetime
-
nctime.utils.time.
get_start_end_dates_from_filename
(filename, pattern, table, frequency, calendar, start=None, end=None)[source]¶ Returns datetime objects for start and end dates from the filename. To rebuild a proper time axis, the dates from filename are expected to set the first time boundary and not the middle of the time interval.
Parameters: - filename (str) – The filename
- pattern (re object) – The filename pattern as a regex (from re library).
- table (str) – The MIP table
- frequency (str) – The time frequency
- calendar (str) – The NetCDF calendar attribute
- start (str) – The timestamp to consider as start instead of filename timestamps
- end (str) – The timestamp to consider as end instead of filename timestamps
Returns: Start and end dates from the filename
Return type: netcdftime.datetime
-
nctime.utils.time.
get_last_timestep
(ffp)[source]¶ Returns the last time step from the time axis of a NetCDF file.
Parameters: ffp (str) – The file full path Returns: The last timestep Return type: int
-
nctime.utils.time.
get_next_timestep
(ffp, current_timestep)[source]¶ Returns next time step from time axis given the current one.
Parameters: - ffp (str) – The file full path
- current_timestep (int) – The current_timestep
Returns: The next timestep
Return type: int
-
nctime.utils.time.
trunc
(array, ndecimals)[source]¶ Truncates each item of a NumPy array to ndecimals decimal places.
Parameters: - array (numpy.array) – The array to truncate
- ndecimals (int) – Number of decimals to keep
Returns: The truncated array
Return type: numpy.array
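Truncation differs from rounding in that digits beyond `ndecimals` are simply dropped. The sketch below uses a plain list and `math.trunc` as a stand-in for the NumPy version; it is an assumption about the intended behaviour, and values that sit exactly on a decimal boundary may be affected by floating-point representation.

```python
import math


def trunc(values, ndecimals):
    """Truncate (not round) each value to ndecimals decimal places."""
    factor = 10 ** ndecimals
    return [math.trunc(value * factor) / factor for value in values]


truncated = trunc([1.239, -0.518, 2.0], 2)
```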
-
nctime.utils.time.
time_inc
(table, frequency)[source]¶ Returns the time incrementation and time units depending on the MIP frequency and table.
Parameters: - table (str) – The MIP table
- frequency (str) – The MIP frequency
Returns: The corresponding time value and units
Return type: list
-
nctime.utils.time.
dates2int
(dates)[source]¶ Converts (a list of) dates as integers.
Parameters: dates (list) – A list of datetime or phony datetime objects Returns: The corresponding formatted integers Return type: list or int
-
nctime.utils.time.
dates2str
(dates, iso_format=True)[source]¶ Converts (a list of) dates in format: %Y%m%d %H:%M:%s.
Parameters: - dates (netcdftime.datetime/list) – A list of datetime or phony datetime objects
- iso_format (boolean) – ISO format date if True
Returns: The corresponding formatted strings
Return type: list or str
-
nctime.utils.time.
date2str
(date, iso_format=True)[source]¶ Converts date in format: %Y%m%d %H:%M:%s.
Parameters: - date (netcdftime.datetime) – A datetime or phony datetime objects
- iso_format (boolean) – ISO format date if True
Returns: The corresponding formatted string
Return type: str
-
nctime.utils.time.
str2dates
(strings, iso_format=True)[source]¶ Converts (a list of) strings in format: %Y%m%d %H:%M:%s into datetime objects.
Parameters: - strings (string/list) – A list of strings to convert
- iso_format (boolean) – ISO format date if True
Returns: A list of datetime or phony datetime objects
Return type: list or str
-
nctime.utils.time.
str2date
(string, iso_format=True)[source]¶ Converts string date format: %Y%m%d %H:%M:%s into datetime object
Parameters: - string (str) – The string to format
- iso_format (boolean) – ISO format date if True
Returns: A datetime or phony datetime object
Return type: netcdftime.datetime
Module author: Levavasseur Guillaume (CNRS/IPSL) <glipsl@ipsl.fr>