Approximate a function given pairwise comparison data of entities with N features

Question

Let's say I'm looking for an apartment with a roommate, and I want to train/discover a model of my preferences that my roommate can use to evaluate whether I'll like a potential apartment without needing my input.

The dataset has some interval features (rent, bedrooms, etc.) and some nominal/categorical features (has_dishwasher, laundry).
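One common way to make the nominal columns usable by a numeric model is one-hot encoding, where each level becomes its own 0/1 feature. The sketch below assumes a few rows in the same format as this question's dataset; the names `sample`, `NUMERIC`, `NOMINAL`, and `encode` are illustrative, not part of any library.

```python
import csv
import io

# A few rows in the same format as the question's dataset (illustrative sample).
sample = """rent,bedrooms,bathrooms,distance_to_work,has_dishwasher,laundry
3695,3,1,21,no,building
3800,3,1,13,no,unit
4000,3,1,8,no,unknown
"""

NUMERIC = ["rent", "bedrooms", "bathrooms", "distance_to_work"]
NOMINAL = ["has_dishwasher", "laundry"]

def encode(rows):
    """Return (feature_names, vectors): numeric columns as floats, nominal columns one-hot."""
    # Collect the levels observed for each nominal column, in sorted order.
    levels = {col: sorted({r[col] for r in rows}) for col in NOMINAL}
    names = list(NUMERIC) + [f"{col}={lv}" for col in NOMINAL for lv in levels[col]]
    vectors = []
    for r in rows:
        vec = [float(r[col]) for col in NUMERIC]
        for col in NOMINAL:
            # One 0/1 indicator per observed level; "unknown" is treated as its own level.
            vec += [1.0 if r[col] == lv else 0.0 for lv in levels[col]]
        vectors.append(vec)
    return names, vectors

rows = list(csv.DictReader(io.StringIO(sample)))
names, X = encode(rows)
```

Treating "unknown" as a separate level is an assumption here; it keeps a statement like "unit > building > unknown" expressible as a comparison of three weights.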

data = """
rent,bedrooms,bathrooms,distance_to_work,has_dishwasher,laundry
3695,3,1,21,no,building
4200,4,2,27,yes,building
4500,4,1.5,19,unknown,building
4200,3,1.5,19,no,building
3800,3,1,13,no,unit
4000,3,1,8,no,unknown
4500,3,2,26,yes,building
4050,3,1,20,no,unknown
3800,3,1,13,no,unknown
"""

The preference dataset is generated from pairwise comparisons, which form a directed graph over the apartments: if there is a path from A to B, then A is considered preferable to B. If there is no path between two nodes in either direction, they can be treated as tied/incomparable.
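To make the path rule concrete, here is a minimal sketch that expands direct comparisons into every implied (preferred, less preferred) pair by graph reachability. The edge list is hypothetical, since the actual comparison graph is not reproduced here, and the graph is assumed to be acyclic.

```python
from collections import defaultdict

# Hypothetical direct comparisons as (preferred, less_preferred) node ids.
edges = [(9, 6), (6, 4), (9, 2)]

def preference_pairs(edges):
    """Return every (a, b) such that b is reachable from a, i.e. a is preferred to b."""
    graph = defaultdict(set)
    for winner, loser in edges:
        graph[winner].add(loser)
    pairs = set()
    for start in list(graph):
        # Depth-first search from each node that has won at least one comparison.
        stack = list(graph[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph.get(node, ()))
        pairs.update((start, node) for node in seen)
    return pairs

pairs = preference_pairs(edges)
# Nodes 4 and 2 have no path between them, so neither ordering appears in pairs.
```

Pairs that appear in neither direction are the ties/incomparable cases and would simply be left out of any training set built from `pairs`.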

I'd like to approximate my preference function for analysis, ideally in a non-black-box fashion, so that I can draw conclusions like:

- "I value in-unit laundry at approximately $100 (plus/minus $10) of rent"
- "4 bedrooms are preferred over 3 bedrooms, all else being equal"
- "I prefer in-unit laundry > building laundry > unknown"
- "Adding an additional bathroom is as preferable as reducing distance_to_work by 2"
- "It's important for distance_to_work to be under 20, but once it is under 20, additional reductions aren't as important" (non-linear?)

What are some approaches that would be appropriate?

I've considered:

- Linear regression: I would guess that some of the relationships are non-linear, as in the last bullet above. I'm also not sure how this works with categorical features.
- Multi-criteria decision-making methods (MCDM): These often seem to be used in linear programming contexts, where, as noted above, assuming linear relationships seems likely to miss details.
- Neural networks: Would probably learn the preference function, but in a black-box fashion that makes analysis difficult.
- Elo systems: Calculating Elo ratings and then training a model to predict them from the features seems doable, but I'm not sure it's the best approach given that the dataset will be small. Also, just because node 6 is between 9 and 4 doesn't necessarily mean that its score should be halfway between theirs, which I believe Elo would tend towards.
- Ordinal regression/learning to rank: Seems like it would be more appropriate.
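As a rough illustration of what the ranking-learning option might look like in an interpretable form, the sketch below fits a linear utility by logistic regression on feature differences (a Bradley-Terry-style model), using plain gradient descent. The toy features (rent in hundreds of dollars, an in-unit laundry indicator), the pairs, and the function name `fit_pairwise_logistic` are all made up for illustration; this is not claimed to be the best method for this problem.

```python
import math

def fit_pairwise_logistic(X, pairs, lr=0.1, epochs=2000, l2=0.01):
    """Fit w so that sigmoid(w . (X[a] - X[b])) is high whenever a is preferred to b."""
    w = [0.0] * len(X[0])
    # Each training example is the feature difference (preferred minus less preferred).
    diffs = [[xa - xb for xa, xb in zip(X[a], X[b])] for a, b in pairs]
    for _ in range(epochs):
        grad = [l2 * wj for wj in w]  # gradient of the L2 penalty
        for d in diffs:
            score = sum(wj * dj for wj, dj in zip(w, d))
            p = 1.0 / (1.0 + math.exp(-score))  # modelled P(a preferred to b)
            for j, dj in enumerate(d):
                # Gradient of -log(p) with respect to w[j], averaged over pairs.
                grad[j] -= (1.0 - p) * dj / len(diffs)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy data: [rent in hundreds of dollars, has in-unit laundry].
X = [[38.0, 1.0], [38.0, 0.0], [40.0, 1.0], [40.0, 0.0]]
# Hypothetical (preferred, less_preferred) index pairs.
pairs = [(0, 1), (0, 2), (1, 3), (2, 3)]

w = fit_pairwise_logistic(X, pairs)
# The ratio -w[1] / w[0] is the rent change (in hundreds of dollars) that the
# model treats as equivalent to gaining in-unit laundry.
laundry_value_in_hundreds = -w[1] / w[0]
```

Weight ratios of this kind are what would support statements like "in-unit laundry is worth about $100 of rent", though on a dataset this small the estimate depends heavily on the regularization strength `l2`. A threshold effect such as distance_to_work under 20 could be represented by adding an indicator feature, which keeps the model linear in its weights while allowing a non-linear response to the raw feature.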

any question that is just a problem description and no code is 99% of the time a Homework question. Just try all the things you propose and choose the best. If none is good enough research more to find a method that works, SO also closes questions seeking recommendation. — rioV8, Jul 10 '23 at 09:16
