{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ce0ab035-e7b5-4b8c-a4a1-5d84335e0619",
   "metadata": {},
   "source": [
    "# Exporting a DataFrame from a list of sequences"
   ]
  },
  {
   "cell_type": "raw",
   "id": "5f10484e-218f-4d1b-aa9d-7093d38d8cb1",
   "metadata": {
    "raw_mimetype": "text/restructuredtext",
    "tags": []
   },
   "source": [
    ":py:class:`~thebeat.core.Sequence` objects can be quickly converted to a :class:`pandas.DataFrame` using the :py:func:`thebeat.utils.get_ioi_df` function. Below we illustrate how it works.\n",
    "\n",
    "The resulting :class:`pandas.DataFrame` is in the `tidy data <https://en.wikipedia.org/wiki/Tidy_data>`_ format, i.e. 'long format', where each individual IOI has its own row."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bbfed0df-5df1-40a8-b34b-4e077481b112",
   "metadata": {},
   "source": [
    "## Preliminaries\n",
    "First we import some necessary functions and create a NumPy generator object with a seed, so you will get the same output as we. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "aa63d2d0-25ed-4579-adea-5c8549f4774e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import thebeat\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "rng = np.random.default_rng(seed=123)"
   ]
  },
  {
   "cell_type": "raw",
   "id": "25ec3e54-1dfb-4a2c-8f2b-bed528acfd50",
   "metadata": {
    "raw_mimetype": "text/restructuredtext",
    "tags": []
   },
   "source": [
    "And we create some random :py:class:`~thebeat.core.Sequence` objects:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "57ce37f6-2053-4c63-b906-a996ce3f30d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create empty list\n",
    "sequences = []\n",
    "\n",
    "# Create 10 random Sequence objects and add to the list\n",
    "for _ in range(10):\n",
    "    sequence = thebeat.Sequence.generate_random_normal(n_events=10,\n",
    "                                                       mu=500,\n",
    "                                                       sigma=25,\n",
    "                                                       rng=rng)\n",
    "    sequences.append(sequence)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d310fba-68af-409e-85e2-d26cea745e30",
   "metadata": {},
   "source": [
    "See what they look like:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "32e91aa0-db01-42bd-b940-e579258b8848",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[Sequence(iois=[475.27 490.81 532.2  ... 484.09 513.55 492.09]), Sequence(iois=[491.94 502.43 461.85 ... 503.41 538.3  483.5 ]), Sequence(iois=[492.21 508.44 444.81 ... 518.87 496.35 532.05]), Sequence(iois=[526.85 509.82 500.13 ... 445.7  490.75 504.11]), Sequence(iois=[521.5  544.04 524.83 ... 535.75 496.09 483.16]), Sequence(iois=[484.02 498.47 490.18 ... 500.7  500.71 501.38]), Sequence(iois=[487.96 485.41 478.45 ... 486.42 486.03 492.09]), Sequence(iois=[488.48 464.09 534.13 ... 489.04 494.71 509.1 ]), Sequence(iois=[523.82 537.99 542.6  ... 503.21 481.64 484.49]), Sequence(iois=[520.33 541.05 494.34 ... 493.18 510.56 497.97])]\n"
     ]
    }
   ],
   "source": [
    "print(sequences)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b16ae08-743e-4ea8-a278-2e2e2316dd7e",
   "metadata": {},
   "source": [
    "## Creating a simple DataFrame of IOIs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "e6bb47d8-16ff-42db-9aed-0bf5700704a1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sequence_i</th>\n",
       "      <th>ioi_i</th>\n",
       "      <th>ioi</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>475.271966</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>490.805334</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>532.198132</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>504.849360</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>523.005772</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>9</td>\n",
       "      <td>4</td>\n",
       "      <td>492.915720</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>9</td>\n",
       "      <td>5</td>\n",
       "      <td>475.121716</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>9</td>\n",
       "      <td>6</td>\n",
       "      <td>493.178206</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>9</td>\n",
       "      <td>7</td>\n",
       "      <td>510.561104</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "      <td>497.966426</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>90 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    sequence_i  ioi_i         ioi\n",
       "0            0      0  475.271966\n",
       "1            0      1  490.805334\n",
       "2            0      2  532.198132\n",
       "3            0      3  504.849360\n",
       "4            0      4  523.005772\n",
       "..         ...    ...         ...\n",
       "4            9      4  492.915720\n",
       "5            9      5  475.121716\n",
       "6            9      6  493.178206\n",
       "7            9      7  510.561104\n",
       "8            9      8  497.966426\n",
       "\n",
       "[90 rows x 3 columns]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = thebeat.utils.get_ioi_df(sequences)\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b267cc7-4554-4bec-a45c-a7a0bdffbf83",
   "metadata": {},
   "source": [
    "## Creating a DataFrame with additional calculations"
   ]
  },
  {
   "cell_type": "raw",
   "id": "9f44f62c-4c65-4ecb-b77a-03df5b0c1ee9",
   "metadata": {
    "raw_mimetype": "text/restructuredtext",
    "tags": []
   },
   "source": [
    "We can pass :py:func:`~thebeat.utils.get_ioi_df` additional functions to its ``additional_functions`` parameter. They will be applied to each provided sequence, and the resulting values will be added to the :class:`pandas.DataFrame`. Note that we need to provide the actual functions, not simply their names. Also note that ``additional_functions`` expects a list, so even if providing only one function, that function needs to be within a list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "a06e1976-3ad2-427f-b98b-5cf2b3ad2360",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sequence_i</th>\n",
       "      <th>mean</th>\n",
       "      <th>std</th>\n",
       "      <th>amin</th>\n",
       "      <th>amax</th>\n",
       "      <th>ioi_i</th>\n",
       "      <th>ioi</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>503.364499</td>\n",
       "      <td>17.923263</td>\n",
       "      <td>475.271966</td>\n",
       "      <td>532.198132</td>\n",
       "      <td>0</td>\n",
       "      <td>475.271966</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>503.364499</td>\n",
       "      <td>17.923263</td>\n",
       "      <td>475.271966</td>\n",
       "      <td>532.198132</td>\n",
       "      <td>1</td>\n",
       "      <td>490.805334</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>503.364499</td>\n",
       "      <td>17.923263</td>\n",
       "      <td>475.271966</td>\n",
       "      <td>532.198132</td>\n",
       "      <td>2</td>\n",
       "      <td>532.198132</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>503.364499</td>\n",
       "      <td>17.923263</td>\n",
       "      <td>475.271966</td>\n",
       "      <td>532.198132</td>\n",
       "      <td>3</td>\n",
       "      <td>504.849360</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>503.364499</td>\n",
       "      <td>17.923263</td>\n",
       "      <td>475.271966</td>\n",
       "      <td>532.198132</td>\n",
       "      <td>4</td>\n",
       "      <td>523.005772</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>9</td>\n",
       "      <td>501.028710</td>\n",
       "      <td>18.898427</td>\n",
       "      <td>475.121716</td>\n",
       "      <td>541.045025</td>\n",
       "      <td>4</td>\n",
       "      <td>492.915720</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>9</td>\n",
       "      <td>501.028710</td>\n",
       "      <td>18.898427</td>\n",
       "      <td>475.121716</td>\n",
       "      <td>541.045025</td>\n",
       "      <td>5</td>\n",
       "      <td>475.121716</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>9</td>\n",
       "      <td>501.028710</td>\n",
       "      <td>18.898427</td>\n",
       "      <td>475.121716</td>\n",
       "      <td>541.045025</td>\n",
       "      <td>6</td>\n",
       "      <td>493.178206</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>9</td>\n",
       "      <td>501.028710</td>\n",
       "      <td>18.898427</td>\n",
       "      <td>475.121716</td>\n",
       "      <td>541.045025</td>\n",
       "      <td>7</td>\n",
       "      <td>510.561104</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>9</td>\n",
       "      <td>501.028710</td>\n",
       "      <td>18.898427</td>\n",
       "      <td>475.121716</td>\n",
       "      <td>541.045025</td>\n",
       "      <td>8</td>\n",
       "      <td>497.966426</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>90 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    sequence_i        mean        std        amin        amax  ioi_i  \\\n",
       "0            0  503.364499  17.923263  475.271966  532.198132      0   \n",
       "1            0  503.364499  17.923263  475.271966  532.198132      1   \n",
       "2            0  503.364499  17.923263  475.271966  532.198132      2   \n",
       "3            0  503.364499  17.923263  475.271966  532.198132      3   \n",
       "4            0  503.364499  17.923263  475.271966  532.198132      4   \n",
       "..         ...         ...        ...         ...         ...    ...   \n",
       "4            9  501.028710  18.898427  475.121716  541.045025      4   \n",
       "5            9  501.028710  18.898427  475.121716  541.045025      5   \n",
       "6            9  501.028710  18.898427  475.121716  541.045025      6   \n",
       "7            9  501.028710  18.898427  475.121716  541.045025      7   \n",
       "8            9  501.028710  18.898427  475.121716  541.045025      8   \n",
       "\n",
       "           ioi  \n",
       "0   475.271966  \n",
       "1   490.805334  \n",
       "2   532.198132  \n",
       "3   504.849360  \n",
       "4   523.005772  \n",
       "..         ...  \n",
       "4   492.915720  \n",
       "5   475.121716  \n",
       "6   493.178206  \n",
       "7   510.561104  \n",
       "8   497.966426  \n",
       "\n",
       "[90 rows x 7 columns]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = thebeat.utils.get_ioi_df(sequences=sequences,\n",
    "                              additional_functions=[np.mean, np.std, np.min, np.max])\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a6f6828-0903-4baf-bf3f-c989d4a4f6d8",
   "metadata": {},
   "source": [
    "## Saving the DataFrame"
   ]
  },
  {
   "cell_type": "raw",
   "id": "9379c828-52d3-40ac-87f2-54e458530111",
   "metadata": {
    "raw_mimetype": "text/restructuredtext",
    "tags": []
   },
   "source": [
    "To save the :class:`pandas.DataFrame`, we can simply use its :meth:`pandas.DataFrame.to_csv` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "ef8d8b5c-ccf2-48c7-b566-14933a0ff594",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv('random_sequences.csv')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}