# 11. SOM Matching space by exact attributes

Date: 2020-11-17
Driver: Harsh S. Kulshrestha
Approver: Wisen Tanasa

# Status

Accepted

# Context

To improve the quality of our data, we have decided to let the admin upload space data in bulk through the provider portal. Uploading data to our inventory is straightforward, but both the admin and the providers need to be able to make sense of the data in the sheet, so that the inventory can be changed as and when necessary. We therefore need a way to identify spaces easily from the information provided.

# Decision

For some providers, a space can be identified by an id (providerSpaceId); for others, this id might not be present in the data they send us. The combination of buildingId and providerSpaceId is a good identifier for spaces, but we also want to uniquely identify spaces that do not have a providerSpaceId. A possible solution is to use a combination of attributes that uniquely identifies a space. One such combination is buildingId, capacity, price and availableFromDate.

Every space has a buildingId, a capacity and a price. Two similar spaces in a building might share the same capacity and price, but it is quite rare for them to also share the same availableFromDate. Hence this combination is a good candidate to serve as a unique identifier for a space.

We will use the providerSpaceId for spaces where the provider supplies it. For the remaining spaces, we will derive the id using a combination/hash of buildingId, capacity, price and availableFromDate.
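The rule above could be sketched roughly as follows. This is a minimal illustration, not the implementation used in the provider portal: the `SpaceRow` shape, the `space_identifier` function, the `|` separator, the `derived-` prefix and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class SpaceRow:
    """One row of the uploaded sheet (hypothetical shape)."""
    building_id: str
    capacity: int
    price: Decimal
    available_from_date: date
    provider_space_id: Optional[str] = None


def space_identifier(row: SpaceRow) -> str:
    """Use the provider's id when present, else hash the fallback attributes."""
    if row.provider_space_id:
        # Scoped by building: the ADR pairs buildingId with providerSpaceId.
        return f"{row.building_id}:{row.provider_space_id}"
    # Normalise each attribute to a canonical string so that equal values
    # (e.g. 1200 and 1200.00) always produce the same hash.
    parts = [
        row.building_id,
        str(row.capacity),
        format(row.price.normalize(), "f"),
        row.available_from_date.isoformat(),
    ]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    # Assumed format: a truncated digest with a prefix marking it as derived.
    return f"{row.building_id}:derived-{digest[:16]}"
```

Normalising the values before hashing matters here, because the same space may arrive with differently formatted prices or dates across uploads, and any difference in the hashed string yields a different identifier.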

# Consequences

By using the existing providerSpaceId as the unique identifier for spaces in Upmo, we preserve the id that providers use to identify their spaces within a building. At the same time, we can identify and update existing spaces instead of creating new ones. Where we derive the identifier from the combination/hash of buildingId, capacity, price and availableFromDate, we can still uniquely identify spaces. However, when the price of an existing space changes, we will not be able to identify that space in our system and will create a new space for it. As a result, a URL that the user shortlisted for the earlier space will no longer exist.

This raises the question of how to identify spaces when there is a marginal change in either the price or the capacity. Although this could be achieved by defining a threshold within which a changed space is still treated as the same space, it would add complexity. By deferring such matches for now, we save on human and technical cost.
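For reference, the deferred threshold approach might look something like the sketch below. It is purely illustrative and is not part of this decision: the `Space` shape, the `within` and `candidate_matches` helpers and the 5% default tolerance are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class Space:
    """Hypothetical stored space with the four fallback attributes."""
    building_id: str
    capacity: int
    price: float
    available_from_date: date


def within(old: float, new: float, tolerance: float) -> bool:
    """True when new differs from old by at most `tolerance` as a fraction of old."""
    if old == 0:
        return new == 0
    return abs(new - old) / abs(old) <= tolerance


def candidate_matches(
    incoming: Space, existing: Iterable[Space], tolerance: float = 0.05
) -> List[Space]:
    """Return existing spaces that match exactly on building and date,
    and approximately (within the tolerance) on price and capacity."""
    return [
        s
        for s in existing
        if s.building_id == incoming.building_id
        and s.available_from_date == incoming.available_from_date
        and within(s.price, incoming.price, tolerance)
        and within(s.capacity, incoming.capacity, tolerance)
    ]
```

Note that a tolerance-based match can return more than one candidate, for example two similar spaces in the same building whose prices both fall within the tolerance. Resolving that ambiguity is part of the complexity this decision avoids.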

While this is not ideal, it is quite rare for a provider to change the price or capacity of a space. Moreover, instead of showing a 404 for that space, we will show a suggested list of spaces within the building that the user might be interested in. Hence we are okay to live with this logic for now. However, relying on the id supplied by the providers poses a risk to the integrity of the data in our system in cases where providers do not use their ids consistently.