geonature.core.imports.utils

Attributes

Classes

ImportStep

Enum where members are also (and must be) ints

Functions

clean_import(→ None)

Clean an import at a specific step.

get_file_size(→ int)

Get the size of a file in bytes.

detect_encoding(→ str)

Detects the encoding of a file.

detect_separator(→ Optional[str])

Detects the delimiter used in a CSV file.

preprocess_value(→ pandas.Series)

Preprocesses values in a DataFrame depending on whether the field contains multiple values (e.g. additional_data) or not.

insert_import_data_in_transient_table(→ int)

Insert the data from the import file into the transient table.

build_fieldmapping(→ Tuple[Dict[str, Dict[str, Any]], ...)

Build a dictionary that maps the source column names to the corresponding field and values.

load_transient_data_in_dataframe(imprt, entity, ...[, ...])

Load data from the transient table into a pandas dataframe.

update_transient_data_from_dataframe(imprt, entity, ...)

Update the transient table with the data from the dataframe.

generate_pdf_from_template(→ bytes)

Generate a PDF document from a template.

get_mapping_data(import_, entity)

Get the mapping data for a given import and entity.

get_required(import_, entity)

compute_bounding_box(imprt, geom_entity_code, ...[, ...])

Compute the bounding box of an entity with a geometry in the given import, based on its entities tree (e.g. Station -> Habitat; Site -> Visite -> Observation).

Module Contents

class geonature.core.imports.utils.ImportStep

Bases: enum.IntEnum

Enum where members are also (and must be) ints

UPLOAD = 1
DECODE = 2
LOAD = 3
PREPARE = 4
IMPORT = 5
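
Because ImportStep derives from enum.IntEnum, its members compare and sort like plain integers, so it is easy to ask whether one step comes before another. The sketch below redeclares the enum locally, using the values listed above, so that it runs without GeoNature installed. The helper `steps_from` is hypothetical and is not part of this module.

```python
from enum import IntEnum
from typing import List


class ImportStep(IntEnum):
    """Local copy of the documented members, for illustration only."""

    UPLOAD = 1
    DECODE = 2
    LOAD = 3
    PREPARE = 4
    IMPORT = 5


def steps_from(step: ImportStep) -> List[ImportStep]:
    """Hypothetical helper: return the given step and every later one."""
    return [member for member in ImportStep if member >= step]


later_steps = steps_from(ImportStep.LOAD)
```

Integer semantics also mean that a step stored as a number can be converted back with `ImportStep(3)`.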
geonature.core.imports.utils.generated_fields
geonature.core.imports.utils.clean_import(imprt: geonature.core.imports.models.TImports, step: ImportStep) → None

Clean an import at a specific step.

Parameters

imprt : TImports

The import to clean.

step : ImportStep

The step at which to clean the import.

geonature.core.imports.utils.get_file_size(file_: IO) → int

Get the size of a file in bytes.

Parameters

file_ : IO

The file to get the size of.

Returns

int

The size of the file in bytes.
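
One common way to measure an open file object is to seek to its end and read the position. The sketch below illustrates that technique under the assumption that the file object is seekable. It is not necessarily how get_file_size is implemented, and the name `file_size_sketch` is hypothetical.

```python
import io
import os
from typing import IO


def file_size_sketch(file_: IO) -> int:
    """Return the size of a seekable file object in bytes (illustrative)."""
    position = file_.tell()      # remember where the caller was
    file_.seek(0, os.SEEK_END)   # jump to the end of the file
    size = file_.tell()          # the end offset is the size in bytes
    file_.seek(position)         # restore the caller's position
    return size


data = b"id;name\n1;foo\n"
buffer = io.BytesIO(data)
buffer.seek(3)
size = file_size_sketch(buffer)
```

Restoring the original position keeps the helper free of side effects for code that reads the file afterwards.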

geonature.core.imports.utils.detect_encoding(file_: IO) → str

Detects the encoding of a file.

Parameters

file_ : IO

The file to detect the encoding of.

Returns

str

The detected encoding. If no encoding is detected, "UTF-8" is returned.

geonature.core.imports.utils.detect_separator(file_: IO, encoding: str) → str | None

Detects the delimiter used in a CSV file.

Parameters

file_ : IO

The file object to detect the delimiter of.

encoding : str

The encoding of the file.

Returns

Optional[str]

The delimiter used in the file, or None if no delimiter is detected.

Raises

BadRequest

If the file does not start with column names.
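
Delimiter detection of this kind can be sketched with the standard library's csv.Sniffer. The example below is an illustration only: the function name `sniff_separator`, the list of candidate delimiters, and the use of ValueError in place of BadRequest are all assumptions, not the behaviour of detect_separator itself.

```python
import csv
import io
from typing import IO, Optional


def sniff_separator(file_: IO, encoding: str) -> Optional[str]:
    """Guess the delimiter from the header line of a binary file object."""
    position = file_.tell()
    file_.seek(0)
    try:
        header = file_.readline().decode(encoding)
    except UnicodeDecodeError:
        return None              # the header cannot be read with this encoding
    finally:
        file_.seek(position)     # always restore the caller's position
    if not header.strip():
        # Stand-in for the documented BadRequest (assumption).
        raise ValueError("the file must start with column names")
    try:
        # Candidate delimiters are an assumption made for this sketch.
        return csv.Sniffer().sniff(header, delimiters=";,\t|").delimiter
    except csv.Error:
        return None              # no consistent delimiter was found


semicolon = sniff_separator(io.BytesIO(b"id;name;date\n1;foo;2020\n"), "utf-8")
unknown = sniff_separator(io.BytesIO(b"abc\n"), "utf-8")
```

Restricting the sniffer to a short list of candidate delimiters keeps it from picking an ordinary letter that happens to occur once per line.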

geonature.core.imports.utils.preprocess_value(dataframe: pandas.DataFrame, field: geonature.core.imports.models.BibFields, source_col: str) → pandas.Series

Preprocesses values in a DataFrame depending on whether the field contains multiple values (e.g. additional_data) or not.

Parameters

dataframe : pd.DataFrame

The DataFrame whose values are preprocessed.

field : BibFields

The field to preprocess.

source_col : str

The column to preprocess.

Returns

pd.Series

The preprocessed values.

geonature.core.imports.utils.insert_import_data_in_transient_table(imprt: geonature.core.imports.models.TImports) → int

Insert the data from the import file into the transient table.

Parameters

imprt : TImports

The current import.

Returns

int

The last line number of the import file that was inserted.

geonature.core.imports.utils.build_fieldmapping(imprt: geonature.core.imports.models.TImports, columns: Iterable[Any]) → Tuple[Dict[str, Dict[str, Any]], List[str]]

Build a dictionary that maps the source column names to the corresponding field and values.

Parameters

imprt : TImports

The import to check.

columns : Iterable[Any]

The columns to map.

Returns

tuple

A tuple containing a dictionary that maps the source column names to the corresponding field and values, and a list of the used columns.

geonature.core.imports.utils.load_transient_data_in_dataframe(imprt: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity, source_cols: list, offset: int = None, limit: int = None)

Load data from the transient table into a pandas dataframe.

Parameters

imprt : TImports

The import to load.

entity : Entity

The entity to load.

source_cols : list

The columns to load from the transient table.

offset : int, optional

The number of rows to skip.

limit : int, optional

The maximum number of rows to load.

Returns

pandas.DataFrame

The dataframe containing the loaded data.
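
The offset and limit parameters make it possible to read the transient table in batches. The sketch below only generates the (offset, limit) pairs for such a loop. The helper `iter_batches` and the idea of knowing the total row count in advance are assumptions made for illustration.

```python
from typing import Iterator, Tuple


def iter_batches(total_rows: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that cover total_rows rows (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for offset in range(0, total_rows, batch_size):
        # limit is a maximum, so the last batch may return fewer rows
        yield offset, batch_size


batches = list(iter_batches(25, 10))
```

Each pair would then be passed as `offset=` and `limit=` in successive calls to load_transient_data_in_dataframe.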

geonature.core.imports.utils.update_transient_data_from_dataframe(imprt: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity, updated_cols: Set[str], dataframe: pandas.DataFrame)

Update the transient table with the data from the dataframe.

Parameters

imprt : TImports

The import to update.

entity : Entity

The entity to update.

updated_cols : Set[str]

The columns to update.

dataframe : pandas.DataFrame

The dataframe to use for the update.

Notes

The dataframe must have the columns “id_import” and “line_no”.
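
A caller may want to verify this requirement before attempting the update. The sketch below checks a collection of column names for the two required columns. The helper `missing_key_columns` is hypothetical and works on any iterable of names, so it runs without pandas.

```python
from typing import Iterable, List

# Columns named in the note above.
REQUIRED_COLUMNS = ("id_import", "line_no")


def missing_key_columns(columns: Iterable[str]) -> List[str]:
    """Return the required columns that are absent (hypothetical helper)."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


missing = missing_key_columns(["id_import", "cd_nom"])
```

With a pandas dataframe, the same check could be written as `missing_key_columns(dataframe.columns)`.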

geonature.core.imports.utils.generate_pdf_from_template(template: str, data: Any) → bytes

Generate a PDF document from a template.

Parameters

template : str

The name of the template file to use.

data : Any

The data to pass to the template.

Returns

bytes

The PDF document as bytes.

geonature.core.imports.utils.get_mapping_data(import_: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity)

Get the mapping data for a given import and entity.

Parameters

import_ : TImports

The import to get the mapping data for.

entity : Entity

The entity to get the mapping data for.

Returns

fields : dict

A dictionary of all the fields associated with the entity (see gn_imports.bib_fields), keyed by field name, with the corresponding BibFields object as the value.

selected_fields : dict

Same format as fields, but restricted to the fields contained in the mapping.

source_cols : list

The list of columns to load into the dataframe, mainly the source columns of non-nomenclature fields.

geonature.core.imports.utils.get_required(import_: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity)
geonature.core.imports.utils.compute_bounding_box(imprt: geonature.core.imports.models.TImports, geom_entity_code: str, geom_4326_field_name: str, *, child_entity_code: str = None, transient_where_clause=None, destination_where_clause=None)

Compute the bounding box of an entity with a geometry in the given import, based on its entities tree (e.g. Station -> Habitat; Site -> Visite -> Observation).

Parameters

imprt : TImports

The import to get the bounding box of.

geom_entity_code : str

The code of the entity that contains the geometry.

geom_4326_field_name : str

The name of the column in the geom entity table that contains the geometry.

child_entity_code : str, optional

The code of the last child entity (of the geom entity) to consider when computing the bounding box. If not given, the bounding box is computed on the geom entity only.

transient_where_clause : sqlalchemy.sql.elements.BooleanClauseList, optional

A where clause to apply to the query when computing the bounding box of a processed import.

destination_where_clause : sqlalchemy.sql.elements.BooleanClauseList, optional

A where clause to apply to the query when computing the bounding box of a finished import.

Returns

valid_bbox : dict

The bounding box of all entities in the given import, in GeoJSON format.
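
To make the idea of a bounding box expressed as GeoJSON concrete, the sketch below builds a GeoJSON Polygon from a set of (longitude, latitude) points in plain Python. It does not reproduce the query that compute_bounding_box runs. The function name `bounding_box_polygon` is hypothetical, and the assumption that the result is a Polygon geometry is only illustrative.

```python
from typing import Dict, Iterable, Tuple


def bounding_box_polygon(points: Iterable[Tuple[float, float]]) -> Dict:
    """Return the bounding box of (lon, lat) points as a GeoJSON Polygon."""
    points = list(points)
    if not points:
        raise ValueError("at least one point is required")
    xs, ys = zip(*points)
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    # Exterior ring, counterclockwise, closed by repeating the first vertex.
    ring = [
        [min_x, min_y],
        [max_x, min_y],
        [max_x, max_y],
        [min_x, max_y],
        [min_x, min_y],
    ]
    return {"type": "Polygon", "coordinates": [ring]}


bbox = bounding_box_polygon([(2.0, 48.0), (3.5, 47.2), (2.7, 49.1)])
```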