hgp.match package

Submodules

hgp.match.matching

Methods to facilitate the formulation and solution to a specific flavor of the Hospital/Resident matching problem used for matching Hidden Genius Project (HGP) mentors to mentees.

The conventional hopsital/resident matching problem requires a secondary procedure for handling unmatched residents, and rankings must be complete (every resident ranked by a hospital must rank the hospital and every hospital ranked by a resident must rank the resident). This module exposes functionality for normalizing partial “hospital” (mentor) and “resident” (mentee) rankings into a total ordering that’s guaranteed to match every mentee to a mentor, provided that there’s enough aggregate mentor capacity. Furthermore, secondary matching is done in a way that ensures optimal outcomes when one side hasn’t explicitly ranked the other.

exception hgp.match.matching.InvalidProblemStatement

Bases: Exception

The exception raised when a problem formulation is invalid.

class hgp.match.matching.MatchStatement(mentee_rankings: Dict[str, List[str]], mentor_rankings: Dict[str, List[str]], mentor_capacities: Dict[str, int])

Bases: object

A simple data class for representing a mentee/mentor match problem statement

Variables
mentee_rankings: Dict[str, List[str]]
mentor_capacities: Dict[str, int]
mentor_rankings: Dict[str, List[str]]
hgp.match.matching.from_csv_files(mentee_rankings_path: str, mentor_rankings_path: str, mentor_capacities_path: str)hgp.match.matching.MatchStatement

Formulates a MatchStatement from csv inputs.

Ranking files should have the following format:

mentee_rankings.csv
S1,M2,M1,M3
S2,M3
S3,M1,M2

mentor_rankings.csv
M1,S1,S2,S3
M2,S3,S2,S1
M3,S1,S3

mentor_capacities.csv
M1,2
M2,1
M3,3

The names are arbitrary (S was chosen to represent students/mentees and M was chosen to represent mentors) but they must be consistent e.g., if S1 ranks M1, M1 should appear as an entry of the first column in mentor_capacities.csv, and if M1 ranks S3, S3 must appear in the first column of mentee_rankings.csv.

Variables
  • mentee_rankings_path – A path to the mentee ranking csv file

  • mentor_rankings_path – A path to the mentor ranking csv file

  • mentor_capacities_path – A path to the mentor capacities csv file

Returns

A MatchStatement formulated from the given inputs

Return type

match_statement

Raises

InvalidProblemStatement – If the ranking is invalid

See also

validate_and_transform()

Which does all of the work after csv files are read

hgp.match.matching.poset_to_ordered(statement: hgp.match.matching.MatchStatement)hgp.match.matching.MatchStatement

Expands a partial ordering wherein mentees have not necessarily ranked every mentor (and vice versa) into a total order. The expansion is done in a way that ensures fairness but results in optimal stable marriage outcomes when one party expresses indifference in the face of another party’s preference.

Variables

statement – A match statement that need not contain a total ordering.

Returns

A MatchStatement formulated from the given inputs

Return type

match_statement

hgp.match.matching.solve(mentee_rankings_path: str, mentor_rankings_path: str, mentor_capacities_path: str, result_path: str, mentee_optimal=True)

Solves the mentor/mentee matching problem, taking csv files as an input and outputing the result as another csv file. See from_csv_files() and solve_from_poset_problem() for additional formats on the expected input and output formats.

The csv file written to result_path will contain entries of the form:

results.csv
M1,S2,S3
M2,S1
M3,S4,S5,S6

The first column is the name of a mentor; each subsequent column is the name of a mentee for that mentor. In the example above, M1 is mentoring S2 and S3.

hgp.match.matching.solve_from_poset_problem(poset_problem: hgp.match.matching.MatchStatement, mentee_optimal=True) → Dict[str, List[str]]

Finds a stable marriage using the modified Gale–Shapley algorithm, given a class:MatchStatement that’s a total ordering. This ordering can be constructed manually, or expanded from a partial ordering via poset_to_ordered().

Variables
  • poset_problem – A totally ordered statement of the matching problem

  • mentee_optimal – Hospital/resident style stable marrage problems are always optimal for one side or the other. By default this function optimizes for mentee outcomes.

Returns

A stable marriage matching. Each key corresponds to a mentor; the list of values are their mentees.

Return type

matching

Raises

Exception – If poset_problem isn’t a valid total ordering

hgp.match.matching.validate_and_transform(mentee_rankings: List[List[str]], mentor_rankings: List[List[str]], mentor_capacities: List[Tuple[str, Union[str, int]]])hgp.match.matching.MatchStatement

Validates and converts a matrix representation of the match problem into the canonical form that’s used elsewhere.

In mentor and mentee rankings, note that rank is implicit, and specified left-to-right. For example:

[["S1", "M2", "M1]] means that S1 has ranked M2 first and M1 second.

The mentor_capacities attribute is an Nx2 matrix. The first column of each row is the name of the mentor. The second is their mentee capacity.

Variables
  • mentee_rankings – Mentee rankings of mentors in matrix form

  • mentor_rankings – Mentor rankings of mentees in matrix form

  • mentor_capacities – Mentor capacities in matrix form.

Returns

A MatchStatement formulated from the given inputs

Return type

match_statement

Raises

InvalidProblemStatement – If the ranking is invalid

hgp.match.matching.validate_rankings(mentee_rankings: Mapping[str, Union[Tuple[int, List[str]], List[str]]], mentor_rankings: Mapping[str, Union[Tuple[int, List[str]], List[str]]], mentor_capacities: Dict[str, int])

Verifies that a ranking is structurally correct. The following invariants must hold:

  • Every mentor ranked by a mentee must exist in the key set of mentor_capacities

  • Every mentee ranked by a mentor must exist in the key set of mentee_rankings

  • Rankings should never contain duplicate entries

If a mentor or mentee is missing validate_rankings will attempt to suggest a similar name; if no names are sufficiently similar, the suggestion will be UNKNOWN

Parameters
  • mentee_rankings – Mentee rankings of mentors

  • mentor_rankings – Mentor rankings of mentees

  • mentor_capacities – The max number of mentees assignable to a mentor

Raises

InvalidProblemStatement – If the ranking is invalid

Module contents