
Installation

To install the Lexset package, use the pip package manager. Open your command prompt or terminal and run the following command (most recent version: 4.2.9):

Overview

The simulation class is part of the Lexset package and is used to manage and interact with simulations. It has the following attributes:

  • user_id (str): The user ID.

  • token (str): The user's access token.

  • organization_id (int, optional): The organization ID. Defaults to None.

Initialization

To initialize the simulation class, first import it from the Lexset package:

Then, provide a user ID and access token, and optionally an organization ID. If the organization ID is not provided, the default value will be 0, and a warning message will be displayed.

Example
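A minimal initialization sketch (the token, user ID, and organization ID below are placeholders; requires the lexset package and valid credentials):

```python
from lexset import simulation

token = "your_access_token"
userID = "your_user_id"
organizationID = 12345

# Providing organizationID up front avoids the warning described below.
sim = simulation(token, userID, organizationID)
```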

Error Messages and Warnings

Warning message:

This warning occurs when the organizationID is not provided during the initialization of the simulation class. You can resolve this warning by either providing the organizationID during initialization or using the setOrganization_id method after the object is created.

setSimulation_id

This method sets the simulation ID for the simulation object.

Arguments

  • simulationID (int): The simulation ID.

Example
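Assuming `sim` is an initialized simulation object, a typical call looks like (the ID is a placeholder):

```python
simulationID = 67890
sim.setSimulation_id(simulationID)
```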

setOrganization_id

This method sets the organization ID for the simulation object.

Arguments

  • organizationID (int): The organization ID.

Example
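Assuming `sim` is an initialized simulation object (the ID is a placeholder):

```python
new_organizationID = 54321
sim.setOrganization_id(new_organizationID)
```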

By following the above examples and explanations, you can successfully install the Lexset package, initialize the simulation class, and set important information using the last two functions. Be aware of the possible warning messages, as they may indicate that certain inputs were not provided or need to be updated.

Most recent version: 4.2.9

pip install lexset

Initialization example:

from lexset import simulation

token = "your_access_token"
userID = "your_user_id"
organizationID = 12345

sim = simulation(token, userID, organizationID)

Warning message:

Organization ID not provided.
To set an organization ID, use the setOrganization_id method.

setSimulation_id example:

simulationID = 67890
sim.setSimulation_id(simulationID)

setOrganization_id example:

new_organizationID = 54321
sim.setOrganization_id(new_organizationID)

Methods

A comprehensive list of available functions in the new API version! We have curated a robust collection of methods and tools designed to empower you, our valued users, to seamlessly integrate our API into your applications and projects. This section will provide you with a detailed overview of each function, its purpose, and sample use cases to help you navigate and make the most of the enhanced capabilities. Our aim is to ensure a smooth and efficient experience as you explore the various offerings of this updated API, unlocking new possibilities and optimizing your workflows.

get_dataset_id

This method retrieves the dataset ID(s) associated with the simulation ID(s) of the simulation object.

Example
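Assuming `sim` is an initialized simulation object, set the simulation ID first and then retrieve the dataset ID(s) (the ID is a placeholder):

```python
sim.setSimulation_id(67890)  # set the simulation ID first to avoid the warning below
sim.get_dataset_id()
```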

Error Messages and Warnings

Warning message:

This warning occurs when the simulation_id attribute is not set before calling the get_dataset_id method. To resolve this warning, use the setSimulation_id method to set the simulation ID(s) before calling the get_dataset_id method.

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

By following the above example and explanation, you can successfully use the get_dataset_id method to retrieve the dataset ID(s) associated with the simulation ID(s). Make sure to set the simulation_id attribute before calling this method and be aware of the possible error messages and warnings.

get_organization_simulations

This method retrieves a list of simulations for a given organization ID based on the specified state.

Arguments

  • state (str): The state of the simulations to be retrieved. Valid states are "RUNNING", "QUEUED", "COMPLETED", and "CREATED".

  • limit (int, optional): The maximum number of simulations to return. Defaults to 100.

Example
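Assuming `sim` is an initialized simulation object, a typical call looks like:

```python
import json

state = "COMPLETED"
limit = 10
simulations = sim.get_organization_simulations(state, limit)

# The method returns JSON
print(json.dumps(simulations, indent=4, sort_keys=True))
```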

Error Messages and Warnings

Error message:

This error occurs when the provided state argument is not valid. To resolve this error, use a valid state ("RUNNING", "QUEUED", "COMPLETED", or "CREATED") when calling the get_organization_simulations method.

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

By following the above example and explanation, you can successfully use the get_organization_simulations method to retrieve a list of simulations for the given organization ID based on the specified state. Make sure to provide a valid state when calling this method and be aware of the possible error messages.

download

This method downloads the dataset associated with the simulation object's dataset ID(s) in parallel using multiple worker threads. The downloaded dataset is saved as a ZIP file.

Arguments

  • localPath (str, optional): The local path where the downloaded dataset will be saved. If not provided, the dataset will be saved in the current working directory. Defaults to "NONE".

  • workers (int, optional): The number of worker threads to use for parallel downloading. Defaults to 2.

Example
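Assuming `sim` is an initialized simulation object with a simulation ID set, a typical call looks like:

```python
localPath = "downloads"  # directory to save the ZIP file to
workers = 4              # number of parallel download threads
sim.download(localPath, workers)
```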

Error Messages and Warnings

Error message:

This error occurs when the simulation_id attribute is not set before calling the download method. To resolve this error, use the setSimulation_id method to set the simulation ID(s) before calling the download method.

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

Error message:

This error occurs when the simulation associated with the dataset is not complete. To resolve this error, wait for the simulation to complete before calling the download method.

By following the above example and explanation, you can successfully use the download method to download the dataset associated with the simulation object's dataset ID(s). Make sure to set the simulation_id attribute before calling this method and be aware of the possible error messages and warnings.

create_simulation

This method creates a new simulation by sending a POST request to the Lexset API. It takes the configuration from a file and encodes it in Base64 format.

Arguments

  • simulationPath (str): The path to the YAML configuration file for the simulation.

  • simulationName (str): The name of the simulation.

  • description (str): A brief description of the simulation.

  • nodeCount (int): The number of nodes requested for the simulation.

  • numImages (int): The number of images to be generated in the simulation.

Example
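Assuming `sim` is an initialized simulation object (paths and values are placeholders):

```python
simulationPath = "path/to/simulation_config.yaml"
simulationName = "My Simulation"
description = "A sample simulation"
nodeCount = 5
numImages = 1000

sim.create_simulation(simulationPath, simulationName, description, nodeCount, numImages)
```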

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

By following the above example and explanation, you can successfully use the create_simulation method to create a new simulation with the specified configuration, name, description, node count, and number of images. Make sure to be aware of the possible error messages and warnings.


delete_simulation

This method deletes an existing simulation by sending a DELETE request to the Lexset API. It works with either a single simulation ID or a list of simulation IDs.

Example
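Assuming `sim` is an initialized simulation object with the simulation ID(s) to delete already set:

```python
sim.delete_simulation()
```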

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

By following the above example and explanation, you can successfully use the delete_simulation method to delete an existing simulation or a list of simulations. Make sure to be aware of the possible error messages and warnings.

add_file

This method uploads a user file to the Lexset API. The supported file types are RELATIONSHIPS and COLORMAP.

Parameters

  • location: The path to the file that you want to upload.

  • type: The type of the file you are uploading. It can be either RELATIONSHIPS or COLORMAP.

Example
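Assuming `sim` is an initialized simulation object (the file path is a placeholder):

```python
sim.add_file("path/to/relationships_file.txt", type="RELATIONSHIPS")
```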

Error Messages and Warnings

Error message:

This error occurs when the file type is not specified or set to "None". To resolve this error, provide a valid file type in the type parameter.

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

By following the above example and explanation, you can successfully use the add_file method to upload a user file to the Lexset API. Make sure to be aware of the possible error messages and warnings.

active_nodes

This method returns the number of active nodes in the user's simulations.

Example
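Assuming `sim` is an initialized simulation object; the method returns an integer, as noted below:

```python
num_nodes = sim.active_nodes()
```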

Return Value

This method returns an integer value representing the number of active nodes.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid.

By following the above example and explanation, you can successfully use the active_nodes method to get the number of active nodes in the user's simulations. Make sure to be aware of the possible error messages and warnings.

start

This method starts the simulation(s) associated with the current simulation object.

Example
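Assuming `sim` is an initialized simulation object with a simulation ID set:

```python
result = sim.start()  # returns "success" on success
```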

Return Value

This method returns a string value "success" when the simulation is successfully started.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid, and the simulation ID is correct.

By following the above example and explanation, you can successfully use the start method to start the simulation(s) associated with the current simulation object. Make sure to be aware of the possible error messages and warnings.

dequeue

This method removes the simulation(s) associated with the current simulation object from the queue and sets it in a ready state. For more information on simulation states, please refer to the documentation.

Example
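Assuming `sim` is an initialized simulation object with a queued simulation:

```python
sim.dequeue()
```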

Return Value

This method returns a string value "success" when the simulation is successfully dequeued.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid, and the simulation ID is correct.

By following the above example and explanation, you can successfully use the dequeue method to remove the simulation(s) associated with the current simulation object from the queue. Make sure to be aware of the possible error messages and warnings.

stop

This method stops the simulation(s) associated with the current simulation object.

Example
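Assuming `sim` is an initialized simulation object with a running simulation:

```python
result = sim.stop()  # returns "success" on success
```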

Return Value

This method returns a string value "success" when the simulation is successfully stopped.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid, and the simulation ID is correct.

By following the above example and explanation, you can successfully use the stop method to stop the simulation(s) associated with the current simulation object. Make sure to be aware of the possible error messages and warnings.

get_progress

This method retrieves the progress percentage of the simulation(s) associated with the current simulation object.

Example
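Assuming `sim` is an initialized simulation object with a simulation ID set; the return value is a JSON object, as noted below:

```python
progress = sim.get_progress()
```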

Return Value

This method returns a JSON object containing the progress percentage of the simulation(s) associated with the current simulation object.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid, and the simulation ID is correct.

Warning message:

This warning is displayed when the get_progress method encounters an error. It suggests using the get_status method to check the simulation status for further information.

By following the above example and explanation, you can successfully use the get_progress method to retrieve the progress percentage of the simulation(s) associated with the current simulation object. Make sure to be aware of the possible error messages and warnings.

get_status

This method retrieves the status of the simulation(s) associated with the current simulation object, including whether the simulation(s) have started and whether they are complete.

Example
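Assuming `sim` is an initialized simulation object with a simulation ID set; the return value is a list of tuples, as noted below:

```python
status = sim.get_status()
```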

Return Value

This method returns a list of tuples containing the simulation ID, the completion status, and the started status of the simulation(s) associated with the current simulation object.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid, and the simulation ID is correct.

By following the above example and explanation, you can successfully use the get_status method to retrieve the status of the simulation(s) associated with the current simulation object. Make sure to be aware of the possible error messages and warnings.

gcp_transfer

The gcp_transfer method allows you to transfer the datasets generated by simulations to a Google Cloud Storage bucket.

NOTE: You must add your GCP credentials before you can use this feature. Please see the credentials documentation for more details.

Parameters

  • bucketName (str): The name of the Google Cloud Storage bucket where the datasets will be transferred.

Example
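Assuming `sim` is an initialized simulation object and your GCP credentials are already linked (the bucket name is a placeholder):

```python
sim.gcp_transfer("your-gcs-bucket-name")
```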

Return Value

This method returns the parsed JSON response if the transfer is successful. If an error occurs, the method returns the HTTP status code.

Error Messages and Warnings

Error message:

This error occurs when the provided access token is invalid or has expired. To resolve this error, ensure that you are using a valid access token when initializing the simulation object.

Error message:

This error occurs when an unexpected HTTP status code is received from the server. To resolve this error, make sure the Lexset API is accessible and functioning properly, and ensure that the provided user and organization IDs are valid, and the simulation ID is correct.

To use the gcp_transfer method, follow the example and explanation provided, and make sure to be aware of the possible error messages and warnings. This method will help you transfer the generated datasets to your Google Cloud Storage bucket.

Merge Datasets

Combine multiple synthetic datasets into one

Merges multiple COCO (Common Objects in Context) datasets stored in different directories into a single, new dataset. The function randomly samples a given percentage of images (and their associated annotations) from each directory to be included in the merged dataset. The merged dataset will include the images, annotations, and categories from each of the original datasets.
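The random-sampling step described above can be pictured with plain Python (a sketch of the idea, not Lexset's implementation; the image IDs are made up):

```python
import random

# Hypothetical image IDs from one source dataset directory.
image_ids = list(range(100))
percentage = 50  # keep 50% of this directory's images

# Sample without replacement; the matching annotations would be kept too.
k = int(len(image_ids) * percentage / 100)
sampled_ids = random.sample(image_ids, k)
```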


    Compression Artifacts

Tools for adding compression artifacts to your simulated data

    You can access the compression artifacts tools as follows.

    import lexset.compression_artifacts as ca
    
    # Provide the directory path where your images are located
    dir_path = "./data/"

    FFMPEG Encoding

    Compresses a sequence of images using ffmpeg and saves individual frames as PNG files

    # Call the function with desired parameters
    ca.ffmpeg_encode(
        dir_path=dir_path,
        framerate=30,  # Frame rate for the output video
        codec="libx264",  # Codec to use for the encoded video file
        extension=".mp4",  # Extension for the video file
        bitrate_kbps=5000,  # Target bitrate (kb/s) for the video file
        bitrate_kbps_min=5000,  # Minimum bitrate (kb/s) for the video file
        bitrate_kbps_max=5000,  # Maximum bitrate (kb/s) for the video file
        bufsize=5000,  # Buffer size of the H.264/H.265 video file
        ffmpeg_path="ffmpeg"  # Optional path to ffmpeg executable
    )

    Arguments:

    • dir_path (str, default=""): Path to the directory containing the images to be processed.

    • framerate (int, default=30): Frames per second for the target video.

    • codec (str, default="libx264"): Codec to use for the encoded video file. Corresponds to the options available with ffmpeg.

    • extension (str, default=".mp4"): Optional extension for the video file. This should correspond with the appropriate codec (e.g. ".mp4" for "libx264").

    • bitrate_kbps (int, default=5000): Target bitrate (kb/s) for the video file.

    • bitrate_kbps_min (int, default=5000): Minimum bitrate (kb/s) for the video file.

    • bitrate_kbps_max (int, default=5000): Maximum bitrate (kb/s) for the video file.

    • bufsize (int, default=5000): Buffer size of the H.264/H.265 video file.

    • ffmpeg_path (str, default="ffmpeg"): Optional path to the ffmpeg executable. Only necessary if ffmpeg is not added to PATH.

  Arguments:

  • json_dirs (type: list of strings)

    • Description: A list of directory paths where each directory contains a COCO-formatted JSON annotation file named "annotations.json" and associated image files.

    • Example: ["path/to/dataset1", "path/to/dataset2"]

  • percentages (type: list of integers or floats)

    • Description: A list of percentages specifying how much data to keep from each directory in json_dirs. The list length should match the length of json_dirs. Each percentage is between 0 and 100.

    • Example: [50, 60] (This will keep 50% of the data from the first directory and 60% from the second.)

  • output_json_path (type: string)

    • Description: The path where the merged JSON annotation file will be saved. The file will be named "annotations.json".

    • Example: "path/to/merged/annotations.json"

  • output_img_dir (type: string)

    • Description: The directory where the merged image files will be saved. If the directory does not exist, it will be created.

    • Example: "path/to/merged/images"

  • Example Use:

    from lexset.LexsetManager import merge_datasets
    
    # Define the directories containing your COCO JSON files and images
    # add as many as you like
    json_dirs = ["D:/<path 1>", "D:/<path 2>"]
    
    # Define the percentage of data to keep from each directory
    percentages = [50, 50]  # 50% of the data from each directory
    
    # Define paths to output JSON and image directory
    output_json_path = "D:/<path 3>/coco_annotations.json"
    output_img_dir = "D:/<path 3>/"
    
    # Merge the datasets
    merge_datasets(json_dirs, percentages, output_json_path, output_img_dir)

    get_dataset_id example and messages:

    sim.get_dataset_id()

    Simulation ID(s) not set. Use the setSimulation_id method to set the simulationID.
    Unauthorized: 401
    getDataset_id() Error: <status_code>

    get_organization_simulations example and messages:

    state = "COMPLETED"
    limit = 10
    simulations = sim.get_organization_simulations(state, limit)
    # This will return JSON
    print(json.dumps(simulations, indent=4, sort_keys=True))

    Invalid State: <state>
    Valid States: RUNNING, COMPLETED, CREATED
    Unauthorized: 401
    get_organization_simulations() Error: <status_code>

    download example and messages:

    localPath = "downloads"
    workers = 4
    sim.download(localPath, workers)

    No simulation associated with this object. Please set a simulation with the setSimulation_id method.
    Unauthorized: 401
    download() Error: <status_code>
    Simulation(s) not complete. Please wait for simulation to complete before downloading.

    create_simulation example and messages:

    simulationPath = "path/to/simulation_config.yaml"
    simulationName = "My Simulation"
    description = "A sample simulation"
    nodeCount = 5
    numImages = 1000
    sim.create_simulation(simulationPath, simulationName, description, nodeCount, numImages)

    Unauthorized: 401
    create_simulation() Error: <status_code>

    delete_simulation example and messages:

    sim.delete_simulation()

    Unauthorized: 401
    delete_simulation() Error: <status_code>

    add_file example and messages:

    sim.add_file("path/to/relationships_file.txt", type="RELATIONSHIPS")

    Please specify a file type, LIST_ALL, RELATIONSHIPS, or COLORMAP
    Unauthorized: 401
    add_file() Error: <status_code>

    active_nodes example and messages:

    sim.active_nodes()

    Unauthorized: 401
    active_nodes() Error: <status_code>

    start example and messages:

    sim.start()

    Unauthorized: 401
    start() Error: <status_code>

    dequeue example and messages:

    sim.dequeue()

    Unauthorized: 401
    start() Error: <status_code>

    stop example and messages:

    sim.stop()

    Unauthorized: 401
    stop() Error: <status_code>

    get_progress example and messages:

    sim.get_progress()

    Unauthorized: 401
    get_progress() Error: <status_code>
    Check simulation status with the get_status method.

    get_status example and messages:

    sim.get_status()

    Unauthorized: 401
    get_status() Error: <status_code>

    gcp_transfer example and messages:

    sim.gcp_transfer("your-gcs-bucket-name")

    Unauthorized: 401
    getDataset_id() Error: <status_code>

    Dataset Analysis

    Tools for reviewing your dataset structure and composition

    You can access the dataset analysis tools as follows.

    These functions are designed to analyze the synthetic datasets you generate and download through our Seahaven platform. Specifically, these functions will process the coco_annotations.json file in your dataset directory. These functions can also be used for analyzing any COCO JSON file.

    Datasets derived outside Seahaven:

    If you use this utility to analyze datasets generated outside the Seahaven platform, be sure to format your annotations in a single COCO JSON file, similar to the ones provided in datasets downloaded from the Seahaven platform. Additionally, all of your images must be in one directory, and that directory should only include images you wish to include in the analysis.

    Spatial Analysis:

    Performs spatial analysis on a dataset containing bounding box annotations of objects. The goal is to create heatmaps representing the distribution of object centers across different categories within the dataset.

    Sample output: The plot contains heatmaps for each category, representing the density distribution of object centers within bounding boxes for that category. The heatmap is created by calculating a 2D histogram of the object centers. This histogram counts the number of object centers that fall into each bin on the heatmap.

    Arguments:

    Bins: int, optional (default=50): This determines the number of equally spaced intervals over the range of the data. The more bins you use, the finer the granularity of the distribution representation. However, too many bins may overfit the data and exaggerate minor fluctuations. Choose this parameter according to the granularity of analysis required.
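The 2D histogram underlying the heatmap can be illustrated with NumPy (the object-center coordinates below are made up for illustration):

```python
import numpy as np

# Hypothetical object-center coordinates (x, y) for one category.
centers_x = np.array([10.0, 12.0, 11.0, 50.0])
centers_y = np.array([20.0, 22.0, 21.0, 60.0])

# 'bins' controls the granularity of the heatmap, as described above.
heatmap, x_edges, y_edges = np.histogram2d(centers_x, centers_y, bins=50)

# Each heatmap cell counts the object centers falling into that spatial bin.
```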

    Class Distribution:

    Analyzes a dataset containing object annotations and generates a bar plot representing the distribution of different classes (categories) present in the dataset.

    Relative Scale:

    Performs relative size analysis on a dataset containing object annotations with bounding box information. The goal is to analyze the relative size of objects within each category and visualize the distribution of relative sizes using histograms.

    Arguments:

    Bins: int, optional (default=50): This determines the number of equally spaced intervals over the range of the data. The more bins you use, the finer the granularity of the distribution representation. However, too many bins may overfit the data and exaggerate minor fluctuations. Choose this parameter according to the granularity of analysis required.

    Bounding Box Areas:

    Analyzes a dataset containing object annotations with bounding box information. The goal is to compute the bounding box areas for each object category and visualize the distribution of these areas using histograms.

    Arguments:

    Bins: int, optional (default=50): This determines the number of equally spaced intervals over the range of the data. The more bins you use, the finer the granularity of the distribution representation. However, too many bins may overfit the data and exaggerate minor fluctuations. Choose this parameter according to the granularity of analysis required.
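COCO stores each bounding box as [x, y, width, height], so the area computation amounts to the following sketch (independent of the library; the boxes are made up):

```python
# COCO-style bounding boxes: [x, y, width, height]
bboxes = [[10, 20, 30, 40], [0, 0, 64, 32]]

# Bounding box area is simply width * height.
areas = [w * h for (_, _, w, h) in bboxes]
```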

    Aspect Ratio Distribution:

    Performs aspect ratio analysis on a dataset containing object annotations with bounding box information. The goal is to calculate the aspect ratio of bounding boxes for each object category and visualize the distribution of these aspect ratios using histograms.

    Arguments:

    Bins: int, optional (default=50): This determines the number of equally spaced intervals over the range of the data. The more bins you use, the finer the granularity of the distribution representation. However, too many bins may overfit the data and exaggerate minor fluctuations. Choose this parameter according to the granularity of analysis required.

    Pixel Intensity Distribution:

    Performs pixel intensity distribution analysis for each color channel and plots the distribution.

    Arguments:

    Type: String, optional (default=Lexset): If set to the default value of Lexset it will automatically filter out all RGB images with the standard Lexset naming convention. If "Type" is set to "other" it will automatically analyze every image in the directory.

    Power Spectral Density:

    Performs a comparative analysis between two directories containing image files by computing their average Power Spectral Density (PSD).

    The function then generates three plots:

    • Average PSD Comparison: This plot contains two subplots, one showing the log-transformed average PSD of images in self.dir1 and the other showing the log-transformed average PSD of images in compare_dir.

    • Difference Map: This plot shows the absolute difference between the two average PSDs. The "hot" colormap is used to highlight the areas where the PSDs differ most.

    • Ratio Map: This plot shows the ratio of the average PSDs. The "coolwarm" colormap is used to highlight the areas of ratio discrepancies. Division by zero is avoided by adding a small constant (1e-8).

    Arguments:

    compare_dir: String - Path to the directory containing data you want to compare your real data against.

    Structural Similarity Index (SSIM):

    The SSIM index ranges from -1 to 1, with a value of 1 indicating that the test image is identical to the reference image. Higher SSIM values generally indicate greater structural similarity and less distortion or difference between the images. This operation will write a JSON file with the results.

    Arguments:

    compare_dir (str): The directory containing real images. target_size (tuple, optional): The target size to resize images to, for SSIM comparison. Defaults to (256, 256).

    Fréchet Inception Distance (FID):

    Fréchet Inception Distance (FID) is a metric used to evaluate the quality of images generated by generative models, relative to real images. It was proposed as an improvement over older metrics like Inception Score. The FID measures the similarity between two sets of images by examining how a pre-trained model (InceptionV3 in this case) interprets them. The idea is to see if the generated or synthetic images are similar to real images in the eyes of the InceptionV3 model.

    Arguments:

    compare_dir (str): The directory containing real images.
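Under the hood, FID is the Fréchet distance between two Gaussians fitted to InceptionV3 features. A minimal sketch of that final formula (not the library's code; requires NumPy and SciPy):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, cov1, mu2, cov2):
    """Fréchet distance between Gaussians (mu1, cov1) and (mu2, cov2)."""
    covmean = sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):  # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))
```

Identical feature distributions give a distance of zero; larger values mean the synthetic images look less like the real ones to the feature extractor.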

    from lexset.review import analysis
    
    dir_path = "D:/<PATH TO DATASET>/"
    
    # Create an instance of the 'analysis' class
    sample_data = analysis(dir_path)

    Python Package

    We are excited to announce the beta release of our latest API version, now available for testing upon request! This new version offers enhanced features, improved performance, and a refined user experience to help you achieve your goals more effectively. As we continue to fine-tune the API, we invite you to access it and share your valuable feedback with us. Your insights will be instrumental in shaping the final release, ensuring it meets the highest standards and exceeds your expectations. To request access to the beta, simply reach out to our support team and join us in shaping the future of our API.

    Sample output: the function also generates a visualization of the embeddings using PCA.
    Sample dataset directory:
    
    DATASET_NAME/
    │
    ├── coco_annotations.json
    ├── images001.png
    ├── images002.png
    ├── images003.png
    └── etc...
    // SAMPLE COCO FILE 
    
    {
      "info": {
        "description": "Example Dataset",
        "version": "1.0",
        "year": 2023,
        "contributor": "Your Name",
        "date_created": "2023-08-31"
      },
      "licenses": [
        {
          "id": 1,
          "name": "License Type",
          "url": "http://www.___.com/"
        }
      ],
      "images": [
        {
          "id": 1,
          "width": 640,
          "height": 480,
          "file_name": "image1.jpg",
          "license": 1,
          "date_captured": "2023-08-31"
        }
        // Additional image entries...
      ],
      "annotations": [
        {
          "id": 1,
          "image_id": 1,
          "category_id": 1,
          "segmentation": [[ /* polygon */]],
          "area": /* area */,
          "bbox": [ /* bounding box */],
          "iscrowd": 0
        }
        // Additional annotation entries...
      ],
      "categories": [
        {
          "id": 1,
          "name": "person",
          "supercategory": "human"
        },
        // Additional category entries...
      ]
    }
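    The annotation file follows the standard COCO schema, so it can be consumed with nothing more than Python's json module. A minimal sketch of reading it and pairing annotations with category names; the inline dictionary is made-up illustrative data, not Lexset output.

```python
import json

# Write a tiny, hypothetical COCO file so the example is self-contained
coco = {
    "images": [{"id": 1, "width": 640, "height": 480,
                "file_name": "images001.png"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1,
                     "bbox": [10, 20, 100, 50], "area": 5000, "iscrowd": 0}],
    "categories": [{"id": 1, "name": "person", "supercategory": "human"}],
}
with open("coco_annotations.json", "w") as f:
    json.dump(coco, f)

# Load it back and index categories by id, as the analysis
# utilities conceptually have to do
with open("coco_annotations.json") as f:
    data = json.load(f)

names = {c["id"]: c["name"] for c in data["categories"]}
for ann in data["annotations"]:
    x, y, w, h = ann["bbox"]  # COCO bboxes are [x, y, width, height]
    print(names[ann["category_id"]], w * h)  # -> person 5000
```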
    sample_data.spatial_analysis()
    sample_data.class_distribution()
    sample_data.relative_scale()
    sample_data.bounding_box_areas()
    sample_data.aspect_ratio_distribution()
    sample_data.plot_pixel_intensity_distribution()
    
    #or 
    
    sample_data.plot_pixel_intensity_distribution("other")
    dir1 = "D:/5869/"
    dir2 = "D:/real_img/"
    
    # Create an instance of the 'analysis' class
    sample_data = analysis(dir1)
    
    sample_data.plot_comparative_psd(compare_dir=dir2)
    dir1 = "D:/5869/"
    dir2 = "D:/real_img/"
    
    # Create an instance of the 'analysis' class
    sample_data = analysis(dir1)
    sample_data.compare_ssim_distributions(compare_dir=dir2, target_size=(256, 256))
    dir1 = "D:/5869/"
    dir2 = "D:/real_img/"
    
    # Create an instance of the 'analysis' class
    sample_data = analysis(dir1)
    sample_data.calculate_FID(compare_dir=dir2)

    Credentials

    The credentials library provides a convenient way to manage credentials and link external resources, such as your Google Cloud Platform (GCP) account, to your Lexset account. The linkAccounts class contains a method called link_gcp_account, which accepts the file path to your GCP service account JSON, a unique name for the credential, and your Lexset access token. By calling this method, your GCP credentials will be linked to your Lexset account.

    Class: linkAccounts

    This class provides methods to link your Google Cloud Platform (GCP) account.

    Method: link_gcp_account

    This method links your GCP account by providing the service account JSON file path, a unique name for the credential, and your Lexset access token.

    Arguments

    • service_account_json_file_path (str): The path to the service account JSON file downloaded from GCP.

    • name_of_credential (str): A unique name for the credential.

    • token (str): Your Lexset access token.

    Example:

    This will link your GCP account with the provided credential information.

    Domain Matching

    A collection of utilities for matching pixelwise features between datasets.

    The library, lexset_dataset_bridge, offers functionalities for image enhancement and manipulation, catering specifically to the domain of dataset augmentation and improvement. It provides three core functions:

    1. adaptive_histogram_matching utilizes the Adaptive Histogram Equalization technique to adjust the brightness and color of synthetic images, aligning them with real images.

    2. color_transfer_mean harmonizes the colors of synthetic images to match real images by transferring the mean and standard deviation of color values.

    3. transfer_shot_noise introduces realistic shot noise, derived from real images, into synthetic images.

    All three functions aim to narrow the gap between synthetic and real image datasets, optimizing the former for training machine learning models with improved real-world accuracy.

    Installation

    The latest version is 1.5.1

    Note: The use of lexset_dataset_bridge is subject to legal terms and conditions which may be found in the license file contained within the package.

    Adaptive Histogram Matching

    • Purpose: This function is designed to refine the brightness and color distribution of synthetic images, making them closely resemble real images.

    • Mechanism: It begins by converting the images to the HSV color space, which separates image intensity (Value) from color information. The average brightness and color values of both the synthetic and real images are computed. These averages are then used to derive scaling factors that adjust the brightness and color of the synthetic images to align with those of the real images. Additionally, the function employs the Adaptive Histogram Equalization technique, which enhances the contrast in images based on local regions, rather than the whole image. This ensures finer control over contrast adjustments and prevents over-amplification of noise.

    • Applications: Such a function can be invaluable in scenarios where synthetic images generated by computer graphics or simulations need to be harmonized with real photographs, ensuring a seamless blend of both in datasets.
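    A simplified numpy sketch of the brightness-scaling step described above. It only covers the mean-value alignment in HSV terms (the library function additionally applies CLAHE and adjusts the color channels); `match_brightness` and the random test images are illustrative, not the library's API.

```python
import numpy as np

def match_brightness(synthetic, real):
    """Scale a synthetic image so its mean brightness matches a real image.
    Images are float RGB arrays in [0, 1]; the HSV 'value' channel is the
    per-pixel max over RGB, which is how HSV defines V."""
    v_syn = synthetic.max(axis=-1)
    v_real = real.max(axis=-1)
    scale = v_real.mean() / max(v_syn.mean(), 1e-8)
    return np.clip(synthetic * scale, 0.0, 1.0)

rng = np.random.default_rng(1)
syn = rng.uniform(0.0, 0.5, size=(32, 32, 3))   # dim synthetic image
real = rng.uniform(0.3, 1.0, size=(32, 32, 3))  # brighter real image
out = match_brightness(syn, real)
print(out.max(axis=-1).mean() > syn.max(axis=-1).mean())  # brighter now
```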

    Color Transfer (mean)

    • Purpose: The primary goal of this function is to make the color distribution of synthetic images akin to real ones.

    • Mechanism: The function calculates the mean and standard deviation of colors in both the synthetic and real images. By adjusting the synthetic image colors using these statistical measures, it's able to transfer the overall color feel from the real image to the synthetic one. The process is efficient and maintains the inherent texture and content of the synthetic image, only altering its color palette.

    • Applications: This is particularly useful for scenarios like style transfer, where one might want to impose the color characteristics of one image onto another. It can also be used in dataset augmentation to ensure that synthetic datasets are colored in ways consistent with real-world data.
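    A minimal numpy sketch of this mean/standard-deviation transfer, done per RGB channel. The `color_transfer_mean` function below is an illustrative re-implementation of the idea, not the library function itself.

```python
import numpy as np

def color_transfer_mean(synthetic, real):
    """Shift each color channel of the synthetic image so its mean and
    standard deviation match the real image (Reinhard-style transfer).
    Images are float RGB arrays in [0, 1]."""
    out = synthetic.astype(np.float64).copy()
    for c in range(3):
        s_mean, s_std = out[..., c].mean(), out[..., c].std()
        r_mean, r_std = real[..., c].mean(), real[..., c].std()
        # normalize the synthetic channel, then rescale to real statistics
        out[..., c] = (out[..., c] - s_mean) / max(s_std, 1e-8) * r_std + r_mean
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(2)
syn = rng.uniform(0.0, 1.0, size=(16, 16, 3))   # full-range synthetic
real = rng.uniform(0.4, 0.6, size=(16, 16, 3))  # muted real palette
out = color_transfer_mean(syn, real)
# channel means now track the real image's channel means
print(np.allclose(out.mean(axis=(0, 1)), real.mean(axis=(0, 1)), atol=0.02))
```

    Because only per-channel statistics move, texture and content in the synthetic image are preserved.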

    Transfer Shot Noise

    • Purpose: To introduce a touch of realistic noise to synthetic images by borrowing shot noise characteristics from real images.

    • Mechanism: This function operates by first calculating the shot noise present in real images. Shot noise is a type of electronic noise that occurs due to the discrete nature of electric charge. Once derived, this noise is then superimposed onto the synthetic images. By doing so, the synthetic images inherit some of the natural imperfections present in real photographs, making them appear more lifelike.

    • Applications: In the domain of machine learning, introducing such noise can be beneficial. Models trained on noisy data often generalize better to real-world scenarios as they become accustomed to handling natural imperfections in the data. This function aids in achieving such a training environment, especially when the initial dataset is predominantly synthetic and lacks real-world imperfections.
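    A simplified numpy sketch of the idea: treat each pixel as an expected photon count and draw a Poisson sample around it. The `photons` knob below is an assumed stand-in for the noise level the library estimates from real images, not a Lexset parameter.

```python
import numpy as np

def transfer_shot_noise(synthetic, photons=200, seed=0):
    """Simulate shot noise by Poisson-sampling each pixel's expected photon
    count. Fewer photons -> noisier image. Input is a float array in [0, 1]."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(np.clip(synthetic, 0.0, 1.0) * photons)
    return np.clip(counts / photons, 0.0, 1.0)

img = np.full((64, 64, 3), 0.5)       # flat gray synthetic image
noisy = transfer_shot_noise(img)
print(round(float(noisy.mean()), 2))  # mean brightness preserved, ~0.5
print(float(noisy.std()) > 0.0)       # but pixel-level noise added
```

    Because Poisson noise scales with intensity, bright regions end up with proportionally more noise, matching how shot noise behaves in real sensors.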

    from lexset.credentials import linkAccounts
    
    
    service_account_path = "/path/to/your/gcp_service_account.json"
    credential_name = "your_unique_credential_name"
    lexset_token = "your_lexset_access_token"
    
    linkAccounts.link_gcp_account(service_account_path, credential_name, lexset_token)
    pip install lexset-dataset-bridge
    import lexset_dataset_bridge as bridge
    
    synthetic_data_dir = "E:/path"
    real_data_dir = "E:/path"
    output_data_dir = "E:/path"
    
    bridge.adaptive_histogram_matching(synthetic_data_dir, real_data_dir, output_data_dir)
    import lexset_dataset_bridge as bridge
    
    synthetic_data_dir = "E:/path"
    real_data_dir = "E:/path"
    output_data_dir = "E:/path"
    
    bridge.color_transfer_mean(synthetic_data_dir, real_data_dir, output_data_dir)
    import lexset_dataset_bridge as bridge
    
    synthetic_data_dir = "E:/path"
    real_data_dir = "E:/path"
    output_data_dir = "E:/path"
    
    bridge.transfer_shot_noise(synthetic_data_dir, real_data_dir, output_data_dir)