Comparing different algorithms for insurance pricing is a task that depends heavily on the data sample used. There are two options: real data and synthetic data. A critical issue with real data is the lack of information about the exact underlying rate that we want to predict, which makes the comparison of different algorithms on real data less clear. We therefore opt for an analysis on synthetic data to avoid this issue. To make the data sample more realistic, we define non-linear dependencies between individual features and the frequency and severity distributions of claims, and introduce complex dependencies among the features themselves. The resulting synthetic data consist of policy records with several features and a claim amount for each policy, which is set to zero if there was no claim. Since we know the underlying frequency and severity distributions for each simulated policy, we can compute the true expected claim rate for each policy. On this sample, several different machine learning algorithms are calibrated to estimate the basic premium rate needed to cover expected claims: GLM, GAM, random forest, XGBoost, LightGBM and neural networks. We compare the predictions with the true underlying rates, identify the best fit, and examine the strengths and weaknesses of each algorithm. Finally, we find remarkable results when evaluating the effects on market share and profit for insurers that would use a particular algorithm in comparison with the others.
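The data-generating setup described above can be sketched as follows. This is a minimal illustration, not the paper's actual simulation: the features (`age`, `power`), the non-linear frequency function, and the gamma severity with feature-dependent mean are all assumptions chosen only to show how an expected claim rate is known by construction while observed claims remain noisy.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Hypothetical policy features (the paper's actual features are unspecified).
age = rng.uniform(18, 80, n)
power = rng.uniform(40, 200, n)

# Assumed non-linear dependency of claim frequency on the features.
freq = 0.05 * (1 + np.exp(-((age - 45) / 12) ** 2)) * (power / 100) ** 0.5

# Assumed severity distribution: gamma with a feature-dependent mean.
sev_mean = 2_000 * (1 + 0.01 * power)
shape = 2.0

# True expected claim rate (pure premium), known exactly by construction;
# this is the target against which fitted models can later be compared.
expected_rate = freq * sev_mean

# Simulated observed data: Poisson claim counts, gamma claim amounts,
# zero total claim amount on policies without a claim.
counts = rng.poisson(freq)
claims = np.array([
    rng.gamma(shape, m / shape, k).sum() if k > 0 else 0.0
    for k, m in zip(counts, sev_mean)
])
```

A pricing algorithm (GLM, GBM, neural network, etc.) would then be fitted on `(features, claims)` and its predictions evaluated against `expected_rate`, which is unobservable with real data.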