Using K-means for player replacement

A Fredrik Aursnes use-case

In this article I will explore using an unsupervised clustering algorithm, K-means, to discover football players with similar statistical outputs that could potentially serve as suitable replacements for the player who’ll be the main focus of this use-case: Fredrik Aursnes. 

INTRODUCTION

On the eve of the 25th of May, Feyenoord was a day removed from playing one of its biggest matches in recent history: the Europa Conference League final against AS Roma. After two tumultuous decades the club had reestablished itself on the European stage with a very impressive run that included victories against Slavia Prague, Union Berlin, Partizan Belgrade and, in the semis, French powerhouse Olympique Marseille. Unfortunately, in a close game with few chances, Feyenoord drew the short straw and lost 1-0.

 

 

Fast forward three months and it has become clear that, although Feyenoord lost that evening, it has won in the aftermath. Luis Sinisterra was sold to Leeds United for 25 million, breaking the club’s transfer record (previously held by Dirk Kuyt, who was sold to Liverpool for 18 million in 2006) by 7 million. In addition, Malacia, Senesi and very recently Aursnes left for a combined 43 million, pushing the total income from outgoing transfers past 60 million, a record as well. This in turn enabled the club to sign players that would previously have been unattainable, including its most recent signings: Hancko (Sparta Prague/Slovakian international), Paixao (Coritiba) and Bullaude (Godoy Cruz). These signings signal a clear shift in recruitment policy: where in the past Feyenoord would often opt for the safer bet with (cheaper and more experienced) Eredivisie signings, it now pursues young, exciting talent across major European and South American leagues. The recent departure of Aursnes, as well as this shift in recruitment policy, will serve as the basis for the use-case of this article.

USE-CASE AND METHODOLOGY

In recent years it has become clear that the Moneyball approach, in which the recruitment of players is undertaken with a statistical and data-driven mindset, has successfully crossed the Atlantic and worked its way into the European football scene. In this article, with a use-case focusing on finding a replacement for Aursnes, I will explore replacing players in a similar Moneyball-esque fashion, using a data-driven approach based on unsupervised machine learning. Compared to traditional video scouting, this approach would probably save (a lot of) time, especially with (much) larger datasets. In addition, one could argue that it creates a more objective foundation for defining the profile of a player, but it must be noted that the data should be viewed within the context of league level and the performance of the team the player belongs to.

 

The approach is as follows: I have downloaded data from Wyscout.com on European and South American players who are reasonably attainable for Feyenoord given its financial range, ambitions and level of play. From this dataset fifteen features are drawn (the features themselves are displayed in the player profile graphs) and clustered in an unsupervised manner using K-means, a machine learning algorithm that partitions a dataset into K clusters such that the within-cluster variance, measured as the squared Euclidean distance of each point to its cluster center, is minimized. Below you can find an example visualization of what the results of such a clustering process could look like when using two features (X and Y) and four clusters. The number of clusters itself is chosen using the elbow method, also visualized below. The elbow method is a popular and intuitive tool for choosing the number of clusters: distortion (the sum of squared distances of each point to its cluster center) keeps decreasing as K grows, so instead of simply minimizing it, one looks for the "elbow", the point beyond which adding clusters yields only marginal reductions, which helps prevent overfitting. For more info on the algorithm itself check out K-means’ Wikipedia page or popular YouTube channels like StatQuest.

Visualization of K-Means output with two features and four clusters

Visualization of the optimal number of K clusters with the Elbow method
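
To make the two ideas above a bit more concrete, here is a minimal sketch in plain Python. It is not the code used for this article, just an illustration of K-means (Lloyd's algorithm) and of the distortion values an elbow plot is built from. The toy data (four tight groups of five points in two features), the function names and the deterministic "farthest-first" seeding are all assumptions made for the example; library implementations such as scikit-learn's typically use randomized seeding instead.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the first
    point, then repeatedly add the point farthest from all chosen centers."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(list(farthest))
    return centers


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
            for p in points]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centers, distortion)."""
    centers = farthest_first_centers(points, k)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Move the center to the mean of its members; an empty cluster
            # simply keeps its previous center.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    labels = assign(points, centers)
    distortion = sum(squared_distance(p, centers[label])
                     for p, label in zip(points, labels))
    return labels, centers, distortion


# Toy data: four tight groups of five points each, two features (X and Y).
offsets = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5), (0.0, 0.0)]
blob_centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
points = [[cx + dx, cy + dy] for cx, cy in blob_centers for dx, dy in offsets]

# Elbow curve: distortion for K = 1..6.
distortions = {k: kmeans(points, k)[2] for k in range(1, 7)}
for k, d in distortions.items():
    print(f"K={k}: distortion={d:.2f}")

# Final clustering with the K suggested by the elbow.
labels, centers, _ = kmeans(points, 4)
```

On this toy data the distortion falls from 1004 at K=1 to 4 at K=4, where each group ends up in its own cluster; that kind of sharp bend is what one looks for in the elbow plot. With real player data the bend is usually far less clear-cut, so the choice of K remains a judgment call.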

RESULTS

After running the K-means algorithm on the data we extract the group of players Aursnes was clustered into and find the players who were deemed most similar to him. The profile of Aursnes (visualized in the graph to the right) can be defined as follows: a holding defensive midfielder who excels in intercepting the ball and, when in possession, serves as a central playmaker with a clear aptitude for receiving passes, progressing play and driving offensive opportunities with key passes, through balls and second/shot assists.
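
One simple way to carry out this extraction step is sketched below: take every player who shares the target's cluster label and sort them by Euclidean distance to the target in feature space. This is an illustration under stated assumptions, not necessarily the exact procedure behind the results in this article. The player names, feature values, cluster labels and the `most_similar` helper are all made up for the example and are not Wyscout data.

```python
import math

# Hypothetical feature vectors (think: three scaled per-90 metrics per player).
features = {
    "Target":   [0.9, 0.8, 0.7],
    "Player A": [0.8, 0.8, 0.6],
    "Player B": [0.5, 0.9, 0.9],
    "Player C": [0.1, 0.2, 0.1],
    "Player D": [0.9, 0.6, 0.7],
}

# Hypothetical cluster labels, as a K-means run might return them.
cluster_of = {"Target": 2, "Player A": 2, "Player B": 2, "Player C": 0, "Player D": 2}


def most_similar(target, features, cluster_of, top_n=5):
    """Players in the target's cluster, nearest first, as (name, distance) pairs."""
    same_cluster = [name for name, label in cluster_of.items()
                    if label == cluster_of[target] and name != target]
    ranked = sorted(same_cluster,
                    key=lambda name: math.dist(features[name], features[target]))
    return [(name, math.dist(features[name], features[target]))
            for name in ranked[:top_n]]


for name, distance in most_similar("Target", features, cluster_of):
    print(f"{name}: distance {distance:.3f}")
```

Restricting the ranking to the target's own cluster keeps the shortlist within one broad player type, while the distance gives an ordering inside that group. The distance only says how close two statistical profiles are; it says nothing about league strength, so the context caveats discussed for each player still apply.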

CARL GUSTAFSSON

AGE 22
TRANSFERMARKT.COM VALUE
1 MILLION

The first player clustered into the same group as Aursnes that we will explore is 22-year-old Carl Gustafsson of Swedish side Kalmar FF. At first glance his possession profile seems fairly similar to that of Aursnes, especially concerning the role of receiving and progressing play, with high percentile scores in received passes per 90, progressive passes per 90 and passes to final third and penalty area per 90. We do however observe a drop in relative performance in deep completions and key passes per 90. From a defensive point of view Gustafsson is, according to the data, not as adept at intercepting the ball as Aursnes is, with a far lower percentile ranking. Gustafsson’s attacking contribution is, except for xG per 90, also a tad lower.

 

 

Overall Gustafsson seems to match Aursnes’s output most closely on the possession side of things. It must however be noted that this data stems from the minutes he has played in the Swedish Allsvenskan, a league that, simply put, is a lot weaker in overall quality than the Dutch Eredivisie. This is also indicated by UEFA’s coefficient rankings (a ranking system based on points earned by clubs through winning and drawing matches in European competitions), where the Netherlands is currently ranked 7th and Sweden 23rd.

MAXIME D'ARPINO

AGE 26
TRANSFERMARKT.COM VALUE
2.5 MILLION

Next up is KV Oostende’s French midfielder Maxime D’Arpino, a product of Olympique Lyon’s acclaimed academy. Having joined KV Oostende in 2020 from (then) Ligue 2 side US Orleans, he has been a regular feature in the starting lineup, with 2880 minutes in the 2020-21 season and 2349 in 2021-22. The first thing that stands out when looking at D’Arpino’s profile is his attacking output regarding shot assists and expected assists, with percentile scores of 95 and 96. D’Arpino also excels in progressing the ball, with very high percentile scores for progressive passes per 90, passes to final third per 90 and passes to penalty area per 90. However, we do see a drop in output across the other possession features, most notably received passes per 90, progressive runs per 90 and, to a lesser degree, key passes and deep completions per 90. On the defensive side of things, he scores well on Padj interceptions and especially Padj sliding tackles, but is clearly much less aerially dominant than both his peers and Aursnes.

 

 

Although KV Oostende itself is a team that does not belong to Belgium’s elite, the Jupiler Pro League is probably much closer in quality to the Eredivisie than the Swedish Allsvenskan is, with Belgium ranked 13th in the UEFA coefficient rankings.

JOE BELL

AGE 23
TRANSFERMARKT.COM VALUE
1.5 MILLION

Moving on to Danish side Brøndby, we explore the next player deemed to be similar to Aursnes: Joe Bell. The British-born, New Zealand-raised player joined Brøndby in 2021 from the Norwegian team Viking Stavanger. Bell also excels in chance creation, with percentile scores of 88 or higher in second assists, shot assists and expected assists per 90. His expected goals per 90 is a lot lower, reinforcing the image of him as primarily a shot creator for his teammates rather than someone who appears in the box a lot himself. Defensively he excels at getting into and winning duels, as well as intercepting the ball, but lacks aerial prowess. In possession he has a fairly all-round profile, with only two features ranking below the 73rd percentile.

 

 

His overall profile, although interesting in itself (especially with his high offensive output), does seem to be less similar to that of Aursnes than some of the other players discussed in this article. In addition, the caveat raised for Gustafsson about the overall level of competition applies here as well, albeit probably to a lesser degree: Denmark is ranked higher than Sweden on the UEFA rankings, but still substantially lower than the Netherlands (18th vs 7th).

MATTHIAS BRAUNÖDER

AGE 20
TRANSFERMARKT.COM VALUE
3 MILLION

Austria Wien’s Braunöder is the next player who was clustered into the same group as Aursnes. However, when looking at his output, the similarity seems less pronounced than with others in this sample. His greater offensive output, with high scores in the assisting metrics, combined with lower scores for receiving and progressing play, hints at a more offensive role within the midfield of Austria Wien compared to that of Aursnes at Feyenoord. His defensive rankings hint at stronger dueling capabilities, but weaker ones when it comes to intercepting the ball. His profile in itself does look interesting, although it perhaps fits the role of a box-to-box midfielder better.

 

One major plus in Braunöder’s case is the competition in which he plays, the Austrian Bundesliga. It is ranked 8th, one place below the Netherlands, and hosts teams that regularly perform (very) well in Europe, such as LASK Linz and Red Bull Salzburg. Moreover, a recent Feyenoord signing, Gernot Trauner, transitioned very successfully from the Bundesliga to the Eredivisie. With an estimated value of 3 million it is likely that, from a financial standpoint, signing him could be feasible as well, although that would also depend on Austria Wien’s willingness to part ways with what appears to be one of their more valuable players just over a day before the transfer deadline.

SIVERT MANNSVERK

AGE 20
TRANSFERMARKT.COM VALUE
1 MILLION

The last player clustered into the same group as Aursnes whom I’ll touch upon is someone who will already be quite familiar to most Feyenoord fans: Sivert Mannsverk. There have already been reports from reliable news outlet 1908.nl that Feyenoord has been closely following the young Norwegian, who plays for the same side Aursnes played for before joining Feyenoord a year ago (Molde FK). It is therefore very interesting that, when applying the clustering algorithm, he too is deemed a player with similar output to Aursnes.

 

His profile does share some distinct similarities with that of Aursnes, including high percentile scores in received passes, deep completions and second assists, shot assists and expected assists per 90. Defensively he is not as skilled in intercepting the ball as Aursnes, but he does seem to be an adept sliding tackler and an aerial force. In addition, his expected goals score places him around the 90th percentile. He does not, however, share Aursnes’s broader possession profile, with relatively low scores in progressive passes and key passes per 90 and substantially lower scores in passes to the final third and penalty area as well.

 

 

Norwegian football itself has recently been experiencing an impressive rise through the European coefficient rankings, with teams like Molde FK and Bodø/Glimt reaching the knock-out stages of the continental competitions. With this apparent increase in overall quality, as well as the success of last year’s signings (Pedersen and Aursnes himself), it appears that the Eliteserien, and specifically Molde, would be an adequate recruiting ground for Feyenoord.

CONCLUSION

After taking a look at the different players clustered into the same group as Aursnes we find that all share similar profiles, but none stands out as a perfect 1:1 replacement. D’Arpino and Bell are also adept at intercepting the ball, albeit less so than Aursnes. They excel in several possession-based features as well, but are not as all-round as Aursnes.

 

The same applies, regarding possession features, to Mannsverk, Braunöder and Gustafsson. In terms of offensive output, D’Arpino, Bell, Mannsverk and Braunöder all appear to outperform Aursnes, especially with regard to assists. It should however be noted that Bell and Mannsverk play in leagues with a relatively lower level of resistance.

 

The goal of this article was to explore the feasibility of searching (large) player datasets for adequate replacements in a relatively short amount of time (compared to traditional video scouting) with an objective point of reference. Whether this was successful or not, I’ll leave for the reader to decide. It’d be great if you could drop a comment on Twitter/LinkedIn to share your thoughts on this article, and if you have any questions or feedback, don’t hesitate to reach out through the contact form on this website.

 

Thank you for reading!

 

Cheers,

 

Daan