geonature.core.imports.utils#
Attributes#
Classes#
ImportStep | Enum where members are also (and must be) ints
Functions#
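clean_import(imprt, step) | Clean an import at a specific step.
get_file_size(file_) | Get the size of a file in bytes.
detect_encoding(file_) | Detects the encoding of a file.
detect_separator(file_, encoding) | Detects the delimiter used in a CSV file.
preprocess_value(dataframe, field, source_col) | Preprocesses values in a DataFrame depending on whether the field contains multiple values (e.g. additional_data) or not.
insert_import_data_in_transient_table(imprt) | Insert the data from the import file into the transient table.
build_fieldmapping(imprt, columns) | Build a dictionary that maps the source column names to the corresponding field and values.
load_transient_data_in_dataframe(imprt, entity, source_cols, offset, limit) | Load data from the transient table into a pandas dataframe.
update_transient_data_from_dataframe(imprt, entity, updated_cols, dataframe) | Update the transient table with the data from the dataframe.
generate_pdf_from_template(template, data) | Generate a PDF document from a template.
get_mapping_data(import_, entity) | Get the mapping data for a given import and entity.
get_required(import_, entity) |
compute_bounding_box(imprt, geom_entity_code, geom_4326_field_name, *, child_entity_code, transient_where_clause, destination_where_clause) | Compute the bounding box of an entity with a geometry in the given import, based on its entities tree (e.g. Station -> Habitat; Site -> Visite -> Observation).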
Module Contents#
- class geonature.core.imports.utils.ImportStep
Bases: enum.IntEnum
Enum where members are also (and must be) ints
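Because ImportStep derives from enum.IntEnum, its members compare and sort like plain integers, so steps can be ordered relative to one another. The following is a minimal sketch using a stand-in enum: the class name `DemoStep`, its member names and values, and the helper `steps_from` are assumptions made for illustration, since this reference does not list the actual members of ImportStep.

```python
from enum import IntEnum


class DemoStep(IntEnum):
    # Hypothetical members; the real ImportStep members are not listed here.
    UPLOAD = 1
    LOAD = 2
    IMPORT = 3


def steps_from(step: DemoStep) -> list:
    """Return every step at or after `step`, in definition order."""
    return [s for s in DemoStep if s >= step]


later_steps = steps_from(DemoStep.LOAD)
```

Integer ordering is the property that a function such as clean_import can rely on when it is given "the step at which to clean"; how it actually uses the step is not described in this reference.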
- geonature.core.imports.utils.clean_import(imprt: geonature.core.imports.models.TImports, step: ImportStep) -> None
Clean an import at a specific step.
Parameters#
- imprt: TImports
The import to clean.
- step: ImportStep
The step at which to clean the import.
- geonature.core.imports.utils.get_file_size(file_: IO) -> int
Get the size of a file in bytes.
Parameters#
- file_: IO
The file to get the size of.
Returns#
- int
The size of the file in bytes.
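One common way to measure an open, seekable file without reading its content is to seek to the end and read the offset. The sketch below illustrates that technique; the function name `file_size_sketch` is hypothetical, and nothing here claims to reproduce how get_file_size is implemented.

```python
import io
import os
from typing import IO


def file_size_sketch(file_: IO) -> int:
    """Return the size in bytes of an open, seekable file."""
    current_position = file_.tell()  # remember where the caller was
    file_.seek(0, os.SEEK_END)       # jump to the end of the file
    size = file_.tell()              # the end offset is the size in bytes
    file_.seek(current_position)     # restore the original position
    return size


buffer = io.BytesIO(b"id;name\n1;Lynx\n")
buffer.read(3)                       # simulate a partially consumed file
size = file_size_sketch(buffer)
```

Restoring the position matters when the same file object is read again afterwards, for example to detect its encoding or delimiter.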
- geonature.core.imports.utils.detect_encoding(file_: IO) -> str
Detects the encoding of a file.
Parameters#
- file_: IO
The file to detect the encoding of.
Returns#
- str
The detected encoding. If no encoding is detected, "UTF-8" is returned.
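The "fall back to UTF-8" contract can be illustrated with a deliberately simple detector that only looks for a byte-order mark. This is a swapped-in technique chosen because it needs nothing beyond the standard library: this reference does not say how detect_encoding works, and the name `sniff_bom_encoding` and the `default` parameter are assumptions.

```python
import codecs
import io
from typing import IO

# Byte-order marks, UTF-32 first so it is not mistaken for UTF-16
# (the UTF-32 LE mark starts with the same two bytes as the UTF-16 LE mark).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
]


def sniff_bom_encoding(file_: IO, default: str = "UTF-8") -> str:
    """Return an encoding name based on the file's byte-order mark, else `default`."""
    position = file_.tell()
    head = file_.read(4)      # the longest mark is 4 bytes
    file_.seek(position)      # leave the file where we found it
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default            # nothing detected: fall back


plain = sniff_bom_encoding(io.BytesIO(b"id;name\n"))
```

A statistical detector covers far more cases than a byte-order mark check; the point of the sketch is only the shape of the return value and the fallback.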
- geonature.core.imports.utils.detect_separator(file_: IO, encoding: str) -> str | None
Detects the delimiter used in a CSV file.
Parameters#
- file_: IO
The file object to detect the delimiter of.
- encoding: str
The encoding of the file.
Returns#
- Optional[str]
The delimiter used in the file, or None if no delimiter is detected.
Raises#
- BadRequest
If the first line of the file contains no column names.
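The contract above (a delimiter, None, or an error when the header is missing) can be sketched with the standard library's csv.Sniffer. This is an illustration under stated assumptions, not the module's code: the name `sniff_separator`, the `candidates` parameter, the 4096-byte sample size, and the use of ValueError in place of BadRequest are all hypothetical choices.

```python
import csv
import io
from typing import IO, Optional


def sniff_separator(file_: IO, encoding: str, candidates: str = ";,\t|") -> Optional[str]:
    """Guess the delimiter of a CSV file opened in binary mode."""
    position = file_.tell()
    sample = file_.read(4096).decode(encoding, errors="replace")
    file_.seek(position)  # leave the file where we found it
    header = sample.splitlines()[0] if sample else ""
    if not header.strip():
        # Stand-in for the BadRequest documented above.
        raise ValueError("the file does not start with column names")
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return None  # the sniffer could not settle on a delimiter


separator = sniff_separator(io.BytesIO(b"a;b;c\n1;2;3\n"), "utf-8")
```

Restricting the sniffer to a short list of candidate delimiters keeps it from picking a letter or digit that happens to occur equally often on every line.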
- geonature.core.imports.utils.preprocess_value(dataframe: pandas.DataFrame, field: geonature.core.imports.models.BibFields, source_col: str) -> pandas.Series
Preprocesses values in a DataFrame depending on whether the field contains multiple values (e.g. additional_data) or not.
Parameters#
- dataframe: pandas.DataFrame
The DataFrame whose values are preprocessed.
- field: BibFields
The field to preprocess.
- source_col: str
The column to preprocess.
Returns#
- pandas.Series
The preprocessed values.
- geonature.core.imports.utils.insert_import_data_in_transient_table(imprt: geonature.core.imports.models.TImports) -> int
Insert the data from the import file into the transient table.
Parameters#
- imprt: TImports
The current import.
Returns#
- int
The last line number of the import file that was inserted.
- geonature.core.imports.utils.build_fieldmapping(imprt: geonature.core.imports.models.TImports, columns: Iterable[Any]) -> Tuple[Dict[str, Dict[str, Any]], List[str]]
Build a dictionary that maps the source column names to the corresponding field and values.
Parameters#
- imprt: TImports
The import to check.
- columns: Iterable[Any]
The columns to map.
Returns#
- tuple
A tuple containing a dictionary that maps the source column names to the corresponding field and values, and a list of the used columns.
- geonature.core.imports.utils.load_transient_data_in_dataframe(imprt: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity, source_cols: list, offset: int = None, limit: int = None)
Load data from the transient table into a pandas dataframe.
Parameters#
- imprt: TImports
The import to load.
- entity: Entity
The entity to load.
- source_cols: list
The columns to load from the transient table.
- offset: int, optional
The number of rows to skip.
- limit: int, optional
The maximum number of rows to load.
Returns#
- pandas.DataFrame
The dataframe containing the loaded data.
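The offset and limit parameters describe ordinary SQL pagination, which makes it possible to process a large import in batches. The sketch below shows the idea against an in-memory SQLite table and returns plain tuples instead of a dataframe. The table name `transient`, the column `src_name`, and the helper `load_rows` are hypothetical; only the `id_import` and `line_no` column names come from this reference.

```python
import sqlite3
from typing import List, Optional, Tuple


def load_rows(conn: sqlite3.Connection, id_import: int, source_cols: List[str],
              offset: Optional[int] = None, limit: Optional[int] = None) -> List[Tuple]:
    """Load `source_cols` (plus line_no) for one import, optionally paginated."""
    # Identifiers are quoted; values go through bound parameters.
    cols = ", ".join('"{}"'.format(c.replace('"', '""')) for c in ["line_no", *source_cols])
    sql = (
        f"SELECT {cols} FROM transient "
        "WHERE id_import = ? ORDER BY line_no LIMIT ? OFFSET ?"
    )
    # In SQLite, LIMIT -1 means "no limit".
    params = (id_import, -1 if limit is None else limit, 0 if offset is None else offset)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transient (id_import INTEGER, line_no INTEGER, src_name TEXT)")
conn.executemany(
    "INSERT INTO transient VALUES (?, ?, ?)",
    [(1, n, f"row-{n}") for n in range(1, 6)],
)
batch = load_rows(conn, 1, ["src_name"], offset=2, limit=2)
```

Ordering by line number keeps batches stable from one call to the next; without an explicit order, consecutive offset and limit windows could overlap or skip rows.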
- geonature.core.imports.utils.update_transient_data_from_dataframe(imprt: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity, updated_cols: Set[str], dataframe: pandas.DataFrame)
Update the transient table with the data from the dataframe.
Parameters#
- imprt: TImports
The import to update.
- entity: Entity
The entity to update.
- updated_cols: Set[str]
The columns to update.
- dataframe: pandas.DataFrame
The dataframe to use for the update.
Notes#
The dataframe must have the columns “id_import” and “line_no”.
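The note above suggests that "id_import" and "line_no" together identify the row to update. A minimal sketch of such a keyed update, using an in-memory SQLite table and a list of dictionaries in place of a dataframe, could look as follows. The table name `transient`, the column `count_min`, and the helper `update_rows` are hypothetical.

```python
import sqlite3
from typing import Dict, Iterable, List


def update_rows(conn: sqlite3.Connection, updated_cols: Iterable[str], rows: List[Dict]) -> None:
    """Write `updated_cols` back, matching rows on (id_import, line_no)."""
    cols = sorted(updated_cols)
    if not all(c.isidentifier() for c in cols):
        raise ValueError("column names must be plain identifiers in this sketch")
    assignments = ", ".join(f'"{c}" = :{c}' for c in cols)
    sql = (
        f"UPDATE transient SET {assignments} "
        "WHERE id_import = :id_import AND line_no = :line_no"
    )
    # Each row dict must provide id_import, line_no and every updated column.
    conn.executemany(sql, rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transient (id_import INTEGER, line_no INTEGER, count_min TEXT)")
conn.executemany("INSERT INTO transient VALUES (?, ?, ?)", [(1, 1, " 3"), (1, 2, "x")])
update_rows(conn, {"count_min"}, [{"id_import": 1, "line_no": 1, "count_min": "3"}])
result = conn.execute("SELECT line_no, count_min FROM transient ORDER BY line_no").fetchall()
```

Only the row whose key matches is touched, which is why a dataframe lacking either key column could not be written back.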
- geonature.core.imports.utils.generate_pdf_from_template(template: str, data: Any) -> bytes
Generate a PDF document from a template.
Parameters#
- template: str
The name of the template file to use.
- data: Any
The data to pass to the template.
Returns#
- bytes
The PDF document as bytes.
- geonature.core.imports.utils.get_mapping_data(import_: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity)
Get the mapping data for a given import and entity.
Parameters#
- import_: TImports
The import to get the mapping data for.
- entity: Entity
The entity to get the mapping data for.
Returns#
- fields: dict
A dictionary of all the fields associated with the entity (see gn_imports.bib_fields), keyed by field name, with the corresponding BibFields object as value.
- selected_fields: dict
Same format as fields, but restricted to the fields contained in the mapping.
- source_cols: list
List of columns to load into the dataframe, mainly the source columns of non-nomenclature fields.
- geonature.core.imports.utils.get_required(import_: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity)
- geonature.core.imports.utils.compute_bounding_box(imprt: geonature.core.imports.models.TImports, geom_entity_code: str, geom_4326_field_name: str, *, child_entity_code: str = None, transient_where_clause=None, destination_where_clause=None)
Compute the bounding box of an entity with a geometry in the given import, based on its entities tree (e.g. Station -> Habitat; Site -> Visite -> Observation).
Parameters#
- imprt: TImports
The import to get the bounding box of.
- geom_entity_code: str
The code of the entity that contains the geometry.
- geom_4326_field_name: str
The name of the column in the geom entity table that contains the geometry.
- child_entity_code: str, optional
The code of the last child entity (of the geom entity) to consider when computing the bounding box. If not given, the bounding box is computed on the geom entity only.
- transient_where_clause: sqlalchemy.sql.elements.BooleanClauseList, optional
A where clause to apply to the query when computing the bounding box of a processed import.
- destination_where_clause: sqlalchemy.sql.elements.BooleanClauseList, optional
A where clause to apply to the query when computing the bounding box of a finished import.
Returns#
- valid_bbox: dict
The bounding box of all entities in the given import, in GeoJSON format.
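To make "bounding box in GeoJSON format" concrete, here is a pure-Python sketch that computes the box of a set of longitude/latitude points and expresses it as a GeoJSON Polygon. It is only an illustration: the helper `bounding_box_geojson` is hypothetical, the choice of a Polygon geometry is an assumption (this reference does not specify the exact GeoJSON type returned), and compute_bounding_box itself takes an import and entity codes rather than a list of points.

```python
from typing import Dict, Iterable, Optional, Tuple


def bounding_box_geojson(points: Iterable[Tuple[float, float]]) -> Optional[Dict]:
    """Return the bounding box of (lon, lat) points as a GeoJSON Polygon, or None if empty."""
    pts = list(points)
    if not pts:
        return None
    lons = [p[0] for p in pts]
    lats = [p[1] for p in pts]
    west, east = min(lons), max(lons)
    south, north = min(lats), max(lats)
    # A GeoJSON linear ring is closed: the first and last positions are identical.
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


box = bounding_box_geojson([(2.0, 48.0), (3.5, 47.0), (2.5, 49.0)])
```

Coordinates are given as longitude then latitude, the axis order GeoJSON uses for EPSG:4326 data.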