21 months ago

srdjanmasirevic2

Hello, I am trying to calculate a correlation coefficient, but the script I wrote gives me a syntax error.

Basically, I have some data and I want to see how the values correlate with each other.

But I am encountering a Python syntax error that I cannot figure out how to fix.

My code looks like this:

```
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0, 10.0)
#READING data
data = pd.read_csv ('benchmarking.csv')
print (data.shape)
data.head()
#Collecting X and Y
X = data['logAUC'].values
Y = data['RMSD'].values
#Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)
print (mean_x, mean_y)
#Total number of values
n = len(X)
# Using the formula to calculate b1 and b2
numer = 0
denom = 0
for i in range(m):
    numer += (X[i] - mean_x * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
b1 = numer/denom
b0 = mean_y - (b1 * mean_x)
print (b1, b0)
```

This is the error I get:

```
denom += (X[i] - mean_x) ** 2
^
SyntaxError: invalid syntax
```

My input data looks like this:

```
Protein name logAUC RMSD
0 Metaloellastase 47.96 0.61
1 FGF1 23.44 0.72
2 FKBP1A 38.98 1.16
3 UDP 15.45 0.58
4 MDM2 18.91 1.42
```

Your line starting `numer += ....` is missing a closing bracket after `mean_x`. I think the error message is just misleading: Python carried on to the next line in search of the closing bracket, so it looks like the error is on the `denom...` line.
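For reference, here is a minimal sketch of how the loop might look once the bracket is closed. It also assumes that `range(m)` was meant to be `range(n)`, since `m` is never defined in the posted script, and it uses only the five rows shown in the question rather than reading `benchmarking.csv`. The helper name `fit_line` is made up for illustration. Note that the loop gives the slope and intercept of a least-squares line, not the correlation coefficient itself; `np.corrcoef` is one way to get Pearson's r.

```python
import numpy as np

# The five rows shown in the question (logAUC, RMSD); the real script
# would take these columns from benchmarking.csv instead.
X = np.array([47.96, 23.44, 38.98, 15.45, 18.91])
Y = np.array([0.61, 0.72, 1.16, 0.58, 1.42])


def fit_line(x, y):
    """Return slope b1 and intercept b0 of the least-squares line (hypothetical helper)."""
    mean_x = np.mean(x)
    mean_y = np.mean(y)
    n = len(x)  # total number of values

    numer = 0.0
    denom = 0.0
    for i in range(n):  # assumed fix: n, not the undefined m
        # closing bracket added after mean_x
        numer += (x[i] - mean_x) * (y[i] - mean_y)
        denom += (x[i] - mean_x) ** 2

    b1 = numer / denom
    b0 = mean_y - (b1 * mean_x)
    return b1, b0


b1, b0 = fit_line(X, Y)

# Pearson correlation coefficient between the two columns
r = np.corrcoef(X, Y)[0, 1]

print(b1, b0)
print(r)
```

If a line reports a syntax error but looks fine, checking the line just above it for unbalanced brackets is usually a good first step.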