# Extending
**NB** The way in which TopoStats can be extended is very much a work in progress. This page is an attempt to document the process and keep it up to date, but it is subject to change and may be outdated at any given point in time.

TopoStats has a modular design so that users can write extensions to carry out different workflows, depending on the types of samples being imaged. Here we describe how to add your own module.

## Core Features

All AFM images need to be filtered, flattened and optionally have scars removed. This core functionality is carried out by the `topostats.filters` module and will be required by all packages that are developed to extend TopoStats.

## Entry Points

TopoStats was designed primarily for batch processing of multiple images, and so it is invoked from the command line. To this end you should include entry points in your module. For consistency it is recommended that you create a `/entry_point.py` which has an argument parser and sub-parsers for each of the steps you wish to create. The entry points are added under the `project.scripts` configuration section of _your_ project's `pyproject.toml`.

Your parser should include the options common to TopoStats (see the arguments added to `parser` in `topostats/entry_point.py`) and the `filter` sub-parser (see the arguments added to `parser.filter` in `topostats/entry_point.py`).

### Custom Configuration

TopoStats uses [YAML][yaml] configuration files to specify the options. In the module you develop you should include a `default_config.yaml` and copy the base configuration parameters and `filters:` section from `topostats/default_config.yaml` as a starting point, as all images require flattening. You may wish to remove the configuration for thresholding though, as this is not always required.

For user convenience you should add a `create-config` sub-parser, using the example in `topostats/entry_point.py`.
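
The entry-point layout described above, including a `create-config` sub-parser, can be sketched with the standard library's `argparse`. This is a minimal illustration under stated assumptions, not the parser from `topostats/entry_point.py`: the program name `mypackage`, the `--remove-scars` flag and the small set of common options are all hypothetical, and a real extension should mirror the arguments defined in TopoStats itself.

```python
import argparse


def create_parser() -> argparse.ArgumentParser:
    """Build a top-level parser with one sub-parser per processing step."""
    parser = argparse.ArgumentParser(
        prog="mypackage",  # hypothetical package name
        description="Sketch of an entry point for a TopoStats extension.",
    )
    # Options shared by every sub-command (an assumed, illustrative subset).
    parser.add_argument("-c", "--config", dest="config_file", help="Path to a YAML configuration file.")
    parser.add_argument("-o", "--output-dir", dest="output_dir", help="Directory to write output to.")

    subparsers = parser.add_subparsers(dest="program", title="program")

    # All images need filtering, so it is exposed as its own step.
    filter_parser = subparsers.add_parser("filter", help="Filter and flatten images.")
    filter_parser.add_argument("--remove-scars", action="store_true", help="Hypothetical flag to remove scars.")

    # Sub-parser that writes a copy of the default configuration to disk.
    config_parser = subparsers.add_parser("create-config", help="Write a default configuration file.")
    config_parser.add_argument("-f", "--filename", default="config.yaml", help="Name of the file to write.")
    return parser


# Parse an explicit argument list rather than sys.argv so the sketch runs anywhere.
args = create_parser().parse_args(["create-config", "--filename", "my_config.yaml"])
print(args.program, args.filename)  # create-config my_config.yaml
```

The common options sit on the top-level parser so that every sub-command inherits them, while step-specific options live on their own sub-parser.
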
With the `create-config` sub-parser in place, users can run ` create-config --config ` and it will, by default, write a copy of `default_config.yaml` to `config.yaml` in the current directory (users can change the output filename with ` create-config --filename my_custom_config.yaml` should they wish to). You shouldn't have to do anything else: TopoStats will look for the `default_config.yaml` in your package and write that to disk.

#### Writing Configuration Files

TopoStats includes the `io.write_yaml()` function to write the YAML configuration alongside all output in the output directory, so that users know what parameters were used in processing their images. Packages using and extending TopoStats can leverage this function to write _their_ configuration files too. To customise the header of this file, define `CONFIG_DOCUMENTATION_REFERENCE` in your package's `__init__.py`, import it where required and define a custom `HEADER_MESSAGE`.

**Example `__init__.py`**

```python
import os
from importlib.metadata import version

from packaging.version import Version

# Disable TensorFlow warnings
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

__version__ = version("afmslicer")
__release__ = ".".join(__version__.split(".")[:-2])

# Split development versions into a base version and a commit hash
AFMSLICER_VERSION = Version(__version__)
if AFMSLICER_VERSION.is_prerelease and AFMSLICER_VERSION.is_devrelease:
    AFMSLICER_BASE_VERSION = str(AFMSLICER_VERSION.base_version)
    AFMSLICER_COMMIT = str(AFMSLICER_VERSION).split("+g")[1]
else:
    AFMSLICER_BASE_VERSION = str(AFMSLICER_VERSION)
    AFMSLICER_COMMIT = ""

# Comment lines written at the top of every configuration file
CONFIG_DOCUMENTATION_REFERENCE = """# For more information on configuration and how to use it:
# https://afm-spm.github.io/AFMslicer/main/configuration.html\n"""
CONFIG_DOCUMENTATION_REFERENCE += f"# AFMSlicer version : {AFMSLICER_BASE_VERSION}\n"
CONFIG_DOCUMENTATION_REFERENCE += f"# Commit: {AFMSLICER_COMMIT}\n"
```

**Example custom `HEADER_MESSAGE`**

```python
from pkgutil import get_data

import yaml
from topostats.io import get_date_time, write_yaml

import afmslicer
from afmslicer import CONFIG_DOCUMENTATION_REFERENCE

HEADER_MESSAGE = f"# Configuration from AFMSlicer run complete : {get_date_time()}\n{CONFIG_DOCUMENTATION_REFERENCE}"


def some_function() -> None:
    # Load the default configuration shipped with the package
    config_data = get_data(
        package=afmslicer.__package__, resource="default_config.yaml"
    )
    config = yaml.full_load(config_data)
    # Write the configuration to the output directory with the custom header
    write_yaml(config, output_dir=config["output_dir"], header_message=HEADER_MESSAGE)
```

This example results in the following header when running `some_function()`:

```yaml
# Configuration from AFMSlicer run complete : 2026-03-06 12:40:32
# For more information on configuration and how to use it:
# https://afm-spm.github.io/AFMslicer/main/configuration.html
# AFMSlicer version : 0.1
# Commit: cd24434e2.d20251114
# TopoStats version: 2.4.1
# Commit: 53b1eb591.d20260306
```

## Tests

Functions and methods should follow the [single responsibility principle][srp] and perform very small, focused calculations. This means that [unit tests][unit_tests] can be written to check that each function does what it purports to. Ideally these should be [parameterised][param-tests] to cover all the possible combinations of options. For example, if a function takes a boolean argument (i.e. `True` or `False`) which determines whether a particular element of a calculation is undertaken, and the output differs as a result, then tests should cover both scenarios.

### Regression Tests

In a number of places we perform [regression tests][regress-test] (also known as integration tests) which test how several components perform together. Typically this is for whole classes/modules which perform a number of different processing steps, and the results are compared to what is expected. To do this we have moved to using the [Syrupy][syrupy] package, which writes "snapshots" of test results to `__snapshots__/test_.ambr` files against which subsequent test runs are compared.

One of the challenges here is the precision of values written to these files and the fact that precision very occasionally varies between operating systems.
To deal with this, some of the values that do vary are separated from the rest of the values being tested and the [`matcher`][matcher] argument is used. For an example of how this is used see this [thread][matcher_example] or the `tests/test_processing.py::test_grainstats()` test.

#### Paths in output

The `TopoStats.img_path` attribute holds the path to the file that is being tested. This will vary depending on the machine the tests are run on, whether that is a personal laptop or one of the Continuous Integration runners on GitHub. As a consequence any regression test whose output includes `img_path` (called `basename` in data frames and `.csv` files) will fail. The solution, when developing tests that might include these fields, is to set `TopoStats.img_path = None` for `TopoStats` objects and to drop them from data frames (e.g. `df.drop(["basename"], axis=1, inplace=True)`).

#### Paths - Windows

Another consideration in writing regression tests is that TopoStats makes heavy use of the [pathlib][pathlib] module. On POSIX systems such as GNU/Linux and OSX this results in path objects of type `PosixPath`, but because Microsoft Windows is not POSIX compliant the paths there are of type `WindowsPath`, and tests that compare such values will fail. The solution, implemented in `tests/conftest.py`, is to mask `pathlib.PosixPath` to be `pathlib.WindowsPath`.

```python
import pathlib
import platform

if platform.system() == "Windows":
    pathlib.PosixPath = pathlib.WindowsPath
```

[matcher]: https://syrupy-project.github.io/syrupy/#matcher
[matcher_example]: https://github.com/syrupy-project/syrupy/issues/913
[param-tests]: https://blog.nshephard.dev/posts/pytest-param/
[pathlib]: https://docs.python.org/3/library/pathlib.html
[regress-test]: https://en.wikipedia.org/wiki/Regression_testing
[srp]: https://en.wikipedia.org/wiki/Single-responsibility_principle
[syrupy]: https://syrupy-project.github.io/syrupy/
[unit_tests]: https://en.wikipedia.org/wiki/Unit_testing
[yaml]: https://yaml.org
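
To see why the path class matters for the Windows issue described above, here is a small sketch using the pure path classes from `pathlib`, which can be instantiated on any operating system. The file path is purely illustrative and this is not how the TopoStats tests are written. It only demonstrates that the `repr()` of a path embeds its class name, so text recorded on one platform need not match text produced on another.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Illustrative path only; any relative path shows the same behaviour.
posix = PurePosixPath("tests/resources/image.spm")
windows = PureWindowsPath("tests/resources/image.spm")

# The repr includes the class name, so the two serialisations differ.
print(repr(posix))
print(repr(windows))
print(repr(posix) == repr(windows))  # False

# A platform-neutral string form compares equal on both.
print(posix.as_posix() == windows.as_posix())  # True
```

Comparing a platform-neutral form such as `as_posix()` is one alternative to masking the class, at the cost of converting every path before it is compared.
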