This demo considers a well-known dataset of fluorescence measurements for three amino acids.
Website: https://ucphchemometrics.com/2023/05/04/amino-acids-fluorescence-data/
Analogous tutorial in Tensor Toolbox: https://www.tensortoolbox.org/cp_als_doc.html
Relevant papers:
Bro, R, Multi-way Analysis in the Food Industry. Models, Algorithms, and Applications. 1998. Ph.D. Thesis, University of Amsterdam (NL) & Royal Veterinary and Agricultural University (DK).
Kiers, H.A.L. (1998) A three-step algorithm for Candecomp/Parafac analysis of large data sets with multicollinearity, Journal of Chemometrics, 12, 155-171.
using CairoMakie, GCPDecompositions, LinearAlgebra
Load data
The following code downloads the data file, extracts the data, and caches it.
using CacheVariables, MAT, ZipFile
data = @cache "amino-acids-cache/data.bson" let
# Download file
url = "https://ucphchemometrics.com/wp-content/uploads/2023/05/claus.zip"
zipname = download(url, tempname(@__DIR__))
# Extract MAT file
zipfile = ZipFile.Reader(zipname)
matname = tempname(@__DIR__)
write(matname, only(zipfile.files))
close(zipfile)
# Extract data
data = matread(matname)
# Clean up and output data
rm(zipname)
rm(matname)
data
end
Dict{String, Any} with 6 entries: "evalme" => "delta=input(' How many nm above exc. would you like to cut off: ');id=on… "EmAx" => [250.0 251.0 … 449.0 450.0] "X" => [0.858 0.32 … 11.931 12.159; 1.312 1.742 … 2.376 1.716; … ; 1.243 -0.071 … "ExAx" => [240.0 241.0 … 299.0 300.0] "readme" => Any["1.Column Trp in M", "2.Column Tyr in M", "3.Column Phe in M"] "y" => [2.67e-6 0.0 0.0; 0.0 1.33e-5 0.0; … ; 1.58e-6 5.44e-6 0.000355; 8.79e-7 …
The data tensor X
is 5×201×61 and consists of measurements across 5 samples, 201 emissions, and 61 excitations.
X = data["X"]
5×201×61 Array{Float64, 3}: [:, :, 1] = 0.858 0.32 0.863 0.301 -0.16 … 14.291 15.286 14.53 11.931 12.159 1.312 1.742 0.987 -0.166 0.928 2.33 2.23 1.975 2.376 1.716 1.248 -0.319 0.658 0.414 -0.518 2.989 2.938 2.739 2.782 3.264 1.243 -0.071 1.017 -0.253 -1.151 10.961 9.597 8.76 7.894 7.189 3.963 1.425 1.764 1.769 1.107 6.05 6.34 5.011 5.558 4.724 [:, :, 2] = 1.868 1.01 1.904 1.198 0.559 … 14.359 14.087 13.352 13.123 11.265 1.659 1.703 -0.317 0.882 0.91 1.934 2.378 2.221 1.675 1.443 -1.364 1.393 0.814 -0.744 -0.322 3.194 2.536 2.81 2.381 2.026 -0.246 2.12 0.247 0.442 0.157 9.526 9.73 8.261 8.198 7.19 2.635 3.581 2.284 1.63 1.298 5.654 5.92 5.058 5.224 4.642 [:, :, 3] = 3.017 2.368 2.014 0.323 -0.24 … 14.335 14.674 14.09 13.342 13.222 3.014 2.774 2.727 1.206 0.325 2.288 1.786 2.39 1.825 1.912 0.373 -0.129 0.432 0.103 -0.92 3.154 3.12 3.332 2.696 3.175 1.303 0.397 1.933 -1.355 -0.101 9.673 9.766 9.351 8.018 8.948 4.02 3.929 4.573 0.297 1.691 6.076 5.386 6.276 5.264 5.301 ;;; … [:, :, 59] = 0.067 0.035 0.102 0.011 0.074 0.111 … 7.214 7.337 6.813 7.07 6.23 -0.05 -0.025 0.035 -0.093 -0.066 0.037 1.993 1.825 1.794 1.856 1.525 -0.032 -0.008 -0.001 -0.086 0.004 0.078 0.899 1.093 0.974 0.943 0.918 0.077 0.106 0.1 0.018 0.005 0.103 4.614 4.481 4.553 4.214 4.206 0.071 0.096 0.101 0.013 -0.074 0.035 2.899 2.876 2.52 2.911 2.201 [:, :, 60] = 0.104 0.111 0.071 0.236 0.107 0.117 … 6.268 6.174 6.293 5.325 5.7 0.0 0.051 -0.033 0.1 0.008 0.052 1.967 1.784 1.788 1.535 1.537 0.11 0.052 -0.01 0.025 0.102 0.051 1.077 1.168 1.103 0.826 0.823 0.035 0.051 0.07 0.037 0.101 0.121 3.767 3.935 4.128 3.616 3.337 0.103 0.118 0.068 0.103 0.104 0.051 2.428 2.578 2.333 2.64 2.194 [:, :, 61] = 0.111 0.092 -0.037 0.05 0.037 0.058 … 5.26 4.682 4.518 4.678 4.52 0.034 0.033 -0.088 0.017 0.001 -0.008 1.767 1.847 1.494 1.439 1.522 0.091 0.032 -0.03 0.108 0.075 -0.007 1.029 1.047 0.805 0.7 0.775 0.001 0.093 -0.033 0.052 0.032 -0.007 3.476 3.523 3.151 3.204 2.728 0.035 0.101 0.022 0.12 0.102 0.06 1.869 2.142 1.901 1.782 1.993
The emission and excitation wavelengths are:
em_wave = dropdims(data["EmAx"]; dims=1)
201-element Vector{Float64}: 250.0 251.0 252.0 253.0 254.0 255.0 256.0 ⋮ 445.0 446.0 447.0 448.0 449.0 450.0
ex_wave = dropdims(data["ExAx"]; dims=1)
61-element Vector{Float64}: 240.0 241.0 242.0 243.0 244.0 245.0 246.0 ⋮ 295.0 296.0 297.0 298.0 299.0 300.0
Next, we plot the fluorescence landscape for each sample.
with_theme() do
fig = Figure(; size=(800,500))
# Loop through samples
for i in 1:size(X,1)
ax = Axis3(fig[fldmod1(i, 3)...]; title="Sample $i",
xlabel="Emission\nWavelength", xticks=250:100:450,
ylabel="Excitation\nWavelength", yticks=240:30:300,
zlabel=""
)
surface!(ax, em_wave, ex_wave, X[i,:,:])
end
rowgap!(fig.layout, 40)
colgap!(fig.layout, 50)
resize_to_layout!(fig)
fig
end
Run CP Decomposition
Conventional CP decomposition (i.e., with respect to the least-squares loss) can be computed using gcp
with its default arguments.
M = gcp(X, 3)
5×201×61 CPD{Float64, 3, Vector{Float64}, Matrix{Float64}} with 3 components λ weights: 3-element Vector{Float64}: … U[1] factor matrix: 5×3 Matrix{Float64}: … U[2] factor matrix: 201×3 Matrix{Float64}: … U[3] factor matrix: 61×3 Matrix{Float64}: …
Now, we plot the (normalized) factors.
with_theme() do
fig = Figure()
# Plot factors (normalized by max)
for row in 1:ncomponents(M)
barplot(fig[row,1], 1:size(X,1), normalize(M.U[1][:,row], Inf))
lines(fig[row,2], em_wave, normalize(M.U[2][:,row], Inf))
lines(fig[row,3], ex_wave, normalize(M.U[3][:,row], Inf))
end
# Link and hide x axes
linkxaxes!(contents(fig[:,1])...)
linkxaxes!(contents(fig[:,2])...)
linkxaxes!(contents(fig[:,3])...)
hidexdecorations!.(contents(fig[1:2,:]); ticks=false, grid=false)
# Link and hide y axes
linkyaxes!(contents(fig.layout)...)
hideydecorations!.(contents(fig.layout); ticks=false, grid=false)
# Add labels
Label(fig[0,1], "Samples"; tellwidth=false, fontsize=20)
Label(fig[0,2], "Emission"; tellwidth=false, fontsize=20)
Label(fig[0,3], "Excitation"; tellwidth=false, fontsize=20)
fig
end
Built with Julia 1.10.4 and
CacheVariables 0.1.4CairoMakie 0.12.2
GCPDecompositions 0.1.2
MAT 0.10.7
ZipFile 0.10.1