Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/bssh icav2 fastq copy manager #144

Merged
merged 12 commits into from
May 1, 2024

Conversation

alexiswl
Copy link
Member

@alexiswl alexiswl commented Mar 10, 2024

Summary

The bssh manager runs performs the following logic:

Reads the bssh_output.json file from the BCLConvert run output folder

Collects the basespace run id from the bssh output json

Collects the run id from the run info xml

Collects the fastq list csv file from the BCLConvert run output folder

Read and return the fastq list rows as a list of dictionaries (gzip compressed / b64 encoded)

Collects the samplesheet path from the BCLConvert run output folder (as we might need this to go into the cttso directories)

Read and return the samplesheet as a dictionary (via the v2-samplesheet-maker package) (gzip compressed / b64 encoded)

Return a manifest of icav2 uris where each key represents a ICAv2 data uri (file) and the value is a list of icav2 uris where each uri is a destination for a given file.

Dependent PRs

TODO List

- [ ] Implement stateful service to track fastq copy For another day

  • Implement event service to raise event when all data has been copied over

@alexiswl alexiswl self-assigned this Mar 10, 2024
@alexiswl alexiswl added the feature New feature label Mar 10, 2024
@alexiswl alexiswl linked an issue Mar 11, 2024 that may be closed by this pull request
Copy link
Member

@victorskl victorskl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, too.!

@victorskl victorskl self-requested a review April 19, 2024 00:11
@victorskl victorskl marked this pull request as draft April 19, 2024 00:14
@victorskl
Copy link
Member

Put this branch into draft mode until we finalise

@alexiswl alexiswl force-pushed the feature/bssh-icav2-fastq-copy-manager branch from 3c72762 to 6426d62 Compare April 28, 2024 10:06
@alexiswl alexiswl marked this pull request as ready for review April 29, 2024 00:21
@victorskl
Copy link
Member

reviewing.. but we merge this after #254 fyi

@alexiswl
Copy link
Member Author

Yes will also need to update the snake_case attributes to camelCase

Copy link
Member

@victorskl victorskl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM all in all. Minor comments. Let us see this first-cut prototype work.

I skip TypeScript case things as we discussed in #186 convo for refactor PR, later.

this,
'icav2_copy_batch_state_machine',
props.icav2_copy_batch_utility_state_machine_name
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No blocker / no action need for this PR. Just something to tinker down track.)

  • I think, this makes dependency on ICAv2CopyBatchUtilityStack (deployment aspect) and, service that stack creates (app aspect) - this is okay, expected.
  • Wonder, whether we could do CloudMap service discovery with Step Function as a Step and, handle when/what-if upstream service go down/no-avail scenario.
  • This would also defer deployment stack dependency to application service dependency at runtime.

See discussion and reference pointers:
https://umccr.slack.com/archives/C03ABJTSN7J/p1714381660323079

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively.

We should send manifest_b64gz back into MainBus channel; let ICAv2CopyBatchUtility service pick it up and, do its job.

That'd create an event schema, say, ICAv2CopyBatchUtilityRequest for example.

In this case, we just have to watch out on EventBridge event message size and number of characters limit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this makes dependency on ICAv2CopyBatchUtilityStack (deployment aspect) and, service that stack creates (app aspect) - this is okay, expected.

Just using the name as defined in the constants, rather than relying on an SSM parameter path, but yes the lookup based on the state machine name will fail if it doesn't exist. Not sure if this is a bad thing given the dependency.

We should send manifest_b64gz back into MainBus channel; let ICAv2CopyBatchUtility service pick it up and, do its job.

Hmm, depends how far down the events rabbit hole we want to go. This might work in this case because the bssh fastq copy manager's last step is calling the copy batch utility. Completion of the copy batch utility could then raise an internal event that lets bssh fastq copy know it can now raise its own completion event.

However, we use this same service for the cttsov2 pipeline. It takes about 3 seconds for the tso fastqs to transfer to their new directory, the analysis is then started after the fastq transfers are complete. If we went with an event based structure for this, the cttso step function would need to submit the event, pause the workflow and then resume for the analysis submission once it hears back from the copy batch utility service. I get the desire to go with the event-based architecture, but I don't think it completely decouples things.

In hindsight, the icav2 copy batch utility service could probably have been part of the infrastructure repository and a static service that can be called on, like the credentials rotation stack

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We learn...

I can throw another option into the mix: create your own (internal) ICAv2CopyBatchUtility exclusively for this stack's purpose. No dependency on existing/external stacks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HeHe, not saying we should refactor. Just keeping in mind that not all duplication is bad... it can have it's advantages.

# # "fastq_list_rows_b64gz": "H4sIADwI6GUC/+Wd32/cNgzH/5Ugz7XPkmzL7pvrYlqB9KV2hwHDYPh+ZAiWXrtr0qEb9r+PVHa6NWV1idIXHtGH8pIDii/Zj0VRFP3L3+dv3KuX58/Pzl3Xj65zY+fyHsxu7Luuz9X5szP4yvAav3KhTaFM3dz97OIF/uzt9vft+z+3F1fL3bz7jL+5mLcb+I3CL23mtfrh6nrz9s2rYbfC71+t5k/6+WKx1OZyWak6WzeVzsq2qbJ5vVpljVErpTfNaq314ur63Tabt/P154+bjwttlFL11BWqqPRUKKunFz/+dPG6ejn8bKdLpeaynYxqL22ZvVhd9++3nza7m7NP5aQnm1lt5rJcVple6k1WFo3NmqWuslav1rOZVVWu28X725sPtzeLYX734Rr+RZQyqcVedzCmQU0XRaGmN2qCv/LL+ePNH/lvf53/p1mftGZNat7H+f+CbdVWm+ZSZ0avTVaubJ3NRpfZqp6baj1vGrusvODpw+7qHfz/WehCm7johamrtq4a/GpZ6MLOy9X69vZq/d2ixkHBvRj88+zsQDJA3HeAs+vyEUwHXLuRJLkVSnIbjGnQQkimNPMi+fFR46AgRnLfw3oMDI993gHSo/9EkWwLmSTbIhjTYGSQTGpmRXJC1DgoiJHs/KIMybXLOwdLMqTYdHZtlVCSD8Y0lEJIpjTzIvnxUeOgILomjyPiO45d3oMNK7TryOzaaqEk62BMQyWEZEozL5IfHzUOCuJrMi7FDhdiX/eCTz2dXRuhJJtgTEMthGRKMy+SHx81DgriFa+7khesyVi4HkdfxtZfk1zW026zu90+iGd9KjzvH4x79fc+ToNFH+uTZfvB+llw/p2iyU1NNCcH/P32esTd9eiPrzqSfyuaf/ulj+3ex40Q/o/p58X/06LJTU10/Yd9OOKPZ9eQzkMqAFk9yX8jmv/mSx83ex+3Qvg/pp8X/0+LJjc18Z187zD9h0Uf0McHAewASP5b0fy3X/q43ftYFUIeAEcdwOsJ8MR4spMTzQE6yAB6PFvDs3I8LIenAvUMqArJz4C9+nsfwclKxjPguANYPQOeGk92cmLPAL8LGH0Z3x/M4SfyGWBLmfTbMhjgUi2DeFo0K8pT4sZCQryq5yCjxxO63Bf0sMhHZvW2EkpzFQxwqRFCMymaF80JcWMh4UgHHN4o6XJclbFGT5HcCCW5qYIB7ixlkEyLZkVyStxYSIjvtLHU5oBkvFgCiTZJshVKsg0GuLMSQjIpmhfJCXFjISGaYWOZDDLs3PlOuLHLDUFy/SCSzamQvG8+aupggDt9G5I5WZLjolmQ/JS4sZAQza5H/APbY4c5dvcNkhuhJDfBAHdaISSTonmRnBA3FhKiNWzfk+7GHLbIeKrtSJJboSS3wQB3NkJIJkXzIjkhbiwkxLNrbETp+rzHOjYszBTJbSGT5LYIBrizlUEyLZoVySlxYyEhXrse/UQk7C73jeYkyUooySoY06ALISSTonmRnBA3FhLiN8V6PxEpxxJ2D7k2SbIWSrIOBrhTCSGZFM2L5IS4sZBwpDukw5GFvtML69gkyUYoyQcD3KmFkEyK5kVyQtxYSIhWvLBjEy9sj3dt23R2XQkluQoGuNMIIZkUzYvkhLixkHBkXqFv2sR+TefT7PJrkvXD9snlqZC873vVKhjgTt9oU54syXHRLEh+StxYSIivyVi87nvMrjtf+qJI1kJJ1sEAd1ZCSCZF8yI5IW4sJERr1zjYH7Lq3A9JgVWZJNkIJdkEA9xZCyGZFM2L5IS4sZBwpO8aLzjijLO7aUckyaVQkg8GuNMKIZkUzYvkhLixkBCfV4RjikaXdz2OH/3GPrkSSnIVDHBnI4RkUjQvkhPixkJCNLv2w0bxVuP+RR0UybVQkutggDtbISSTonmRnBA3FhLi3Zp+dCjuk3vPMkmyFUqyDcY0mEIIyaRoXiQnxI2FhPgNCuzXdHezwP14f4rkRijJTTDAnUoIyaRoXiQnxI2FhCNvssS5/tjjhcm1o2vXrVCS22CAO7UQkknRvEhOiBsLCUcm9OLrpbscK9e+iE2QbAqZJJsiGOBOI4NkWjQrklPixkJCfMamn951N2mg/8Z5shHa42VUMMCdQnq8aNG8SE6IGwsJ8TdZ4l0of6vR+X4vkmShPV5GBwPcKaTHixbNi+SEuLGQEF2T/ZyBzuFEvs6/BosiWWiPlzHBAHcK6fGiRfMiOSFuLCQcudXo3yiN77Bw/mUWFMlCe7zMwQB3CunxokXzIjkhbiwkxHu8nL/PmGOnpuvo82QjtMfLVMEAdwrp8aJF8yI5IW4sJMT7rscepw3kft4AtogQJJdC1+TyYIA7hfR40aJZkZwSNxYSHjdbs0yerXlyJBMDDk+fZFo0K5JT4sZCwuMm8pXJE/lOjmRiLNrpk0yLZkVyStxYSDhSu3b+BgWeKjt8QRRFstDsuj0Y01BKIZkUzYvkhLixkHCf5F//BQXTEuJCoAAA", # pragma: allowlist secret
# # "samplesheet_dict_b64gz": "H4sIADwI6GUC/9VabW/bNhD+K4P2dSn0/pJ+0lRAKNBlQ8N+KIbiINtyI9SWXVlJVxT57+NRZkxRIi0TwlYjaNCIEp87kXruOd79sB7KYlU21u0vP6x1tSlhvWu2RQtPZXOodjW97v72i9U81lAX25L+aZHD1/rmrly8v0tdz3Ec7+aPd2V5T96DE97tnlzPog9U9aFtHrdl3UL7fc+eo2PFffnVesb5KOiBYeL/wIHl9+WmxCtO4BzHwZWvVvWq/Kd3s/1yVbzZRojFcrPc1dSNFg5l21b15w5wW9XV9nELbVNtt+UKGNSmrD+3D3TcC+ij/JZiVezbsoEdnWVT7HEYR4vDFzg87OjE3A321GG3br8VTSm8Ost/5b6KLMmcVdEWdPDvH9amqPHVoG+HYrunb79a4WPvXM92vDC2uHt4MU8zkqc5SfOXyy5ez+j1lGRpmuF1tLWpVuXpdVgfojvno+N7r986NvvHL+D93MduHdh8JCcZ/aGTpnRuOjUZ3OiqbqSunnUr6blFPcpS6lue9t3CyXMKkZPrcCuye25lGV0pOhPJ+m7RCXPChq7ELUfahLhcdA9KmzDN6WLhtFeyCSO3v1qEoPl0FunbogN0IfP0WjahJ60Wrkt+XBVxtegK0qHsZ9uE7rhbfkjnoDFIIo6OOeQ1QzIkhPPkFTgXjTiXsanoRGTwnRHG9+mVOBePrRz9qNC3QSSjE9MVpfv1SpxLRpxjoSwfUgn1C10mP108UzgX2GMrx55Hrh9ENQxr+U9H/wrnIr/nFtuOhPOhQCXI/Th0LW4FEolQ/ZFiAJC+M5wWAa5kK8aBrEFQ7UrfV86mU27AjzSFeP02vnO7X/jnJOhI2v/4ZctSFXc+6gRDaE8BHfYXEz+wVN6hOQvwymhgCi1lHgR/CJGh8aufHbqfHRCmX3Iih/gMSXbmF55ICp7RtSydMvykqOszQ8sqm7CcbhD7mSaYGdqVVFXG8i55ren+Jnk2M7Qn01WKOe6Qg3HrzwzdpxRG/oPMmnRBzfSF++PQriMnioynBxSd82WYEdqVvMYtnmWDF57yzTcjtJSZ4GkGkbNIpnpzYyJVQfsyh2O8H2QNnd6eGTqQhC9KXHmHU1JH4Tv3WvfDR85yokHQFM4oZoSOJCJlWctgrTOiUcOm0LEUPpBK85EMMc/Sub0enGthejpgM3zf+cw73LPlFASPCuWzDMLSq5l3uOdISpqJsBGpkM3+XXvyGQ5GrmHQzAmZ/bv2+mzWCQV5h6Ne4xn8jNC+HDTZAdXwlJFnaTNCy2yWswgpH0ezs6WZvfZlDqdMmg+8Rm5Vn/sYQv8XitT//2ShCtqXd3g+DB/4qedpdtHH9Yneu9zsHlf9ssznsi6boi1XYvnEfkV/rJcHvu2aL+vN7hsOVcvi5W9gr0mosOyrfbmpmHvWY1PfVpst/bUsbvnA7WKdeItgub5ZLtzoxl+ti5vYD8ubYrkKy4UdO8Wy+PX35SY7zvjkgwtdJaez5VTEUUq8TbVoiub7S+GMjwGXfCCqPH73vin38KVqxXrbPXnfrZlS0ymwXOAaD0RZZ4zlabA84KIORB1njOVrsHzgKg5E4WaMFWiwAuCyDUSlZowVarBC4DoNRGlmjBVpsCLgwgxELWaMFWuwYuBKDETxZYyVaLAS4NILRLVlitVpq3EszwautUCUV8ZYGt7wHODiCkQ9ZYyl4Q3PBa6mQBRQxlga3vA84PIJRMVkjKXhDc8HrpdAlEjGWBre8JA3OoEEoiYyxfI1fvnIh50iAlEEGWMJ9TYF4vEOOFXhoF94U2IvW3L/55OrBo/Ogh/vgFOVDPqFMXPw+Cx4zD1/qWJBv3BlDp6cBT/eAacqE/QLS8bgYr1nHJzfAacqEPQLP8bgoSZ4hDGcWl+g3+1iDqiJICGLIMemFOj3oRgDRpowEtlwaheBfoeIOaAmlkQOnBo5oN+7YQ6oCSgRBhTeYgH9rgpzQE1UiTw4NT9Av9/BHFBDwZEPpyoi9AuH5oCa+BIFcKrvQb+kZwwYawDjAHjlDcRimxKs6xFUY2mEcBwCr3eBWOIyxtII4TgCXtYDsZJnjKXhsjgGXkwDsX5mjKWhsTgBflYB4vGEThTcFfVOCZZoKCyxgdfLQCyRmYNp6CtB+uqOQkA8/TAH01BX4gKviYFYBjMH09BW4gGvgoFY+DIH01BW4gM/1gHxJMccTMMeSQC8yAZiXe08GB4YtYddYNub/pHRzI0Iwo2L8qF4qnaP2BZtYaOwJXQF6xqHFa3BF7VYPAv+ak+ZRJV+HOa91m/uupU89TZc2DZXRdD1VXeAH978Zdt2d94ejI5MUfPTjZzW/jZqpKc00puk+i94k5Pa2EaN9JVG+pOyg+lGTmtHGzUyUBoZTMoiLniTk9rKRo0MlUaGZ7INrXnO5Z33o+YlSvOSM7nJdPOmddCPmefYKvPoiD6TmW7etE74UfMcpXnOmbznksWd0tE+ap6SD51zWdIFb29SZ/qoeUomdLwzOdUlb29Kh/moeUoOdPwzGdh0YpnW2DlqnpL9nOBMvnZJmJvSoDlqnpL36Mjzp+d/AUNtC2LYNQAA", # pragma: allowlist secret
# # "manifest_b64gz": "H4sIADwI6GUC/+2cX2/bNhTFv0rR58kSL/W3b03argESpI3SdcAwELRE29pkSZPoLNmw7z7GrZlmy9Y0AWORuk85BgLk8nePDi8lOX8+rwp+AS98fw50MY9I7JVpBF6YpZHHy6LwUkoKAiItSgC/qteNxxteXw1i8IESQmL2MiBBBCwgCbCDtz8cn0Sv8h8TtiCEhxmjJFskoXdQ1IdtcyF6+ewiZMASLwHKw3AeeTAH4YVBmnjpHCIvg6LklJMoLDO/3chuI/3jdjn4H3nfVM1ymNXt8vmLZz/p0pMoi0S6AI9CSb2wSGKPUwi9IuZpVHKRJvNoWzrr+mrN+ysfAqD/X75P4yiLo/T6V8MAgoTPi3KzqcpPtTz/+btnFqE7ahYtYvtmbK/7vu3Rb98O7g0f5G+H7bqrhRQzeSmR3335nYmu7eXgvxLrTS0rRfCS5ZLLYVYMF3vEuKvLPpIvS95J0bMTIfuqQI4P5fh+w+tKXiHHR3I8akp1Tb9tu05NM+yw3TR4cT8YZs6vN5l8JYREhg9kuLjerVldDYjwsXvM4VVRC0zIR9I82zTbE8vlukZ+j5wdzytlSBwgH71hf76mTzdyNq8a5PgAjudtxz40vzbt7w074H3RlgJN+VCY/fbnbCUxIx91mtnmI27YD4P5afoe/GPeCEb8Y6ABocmNYHnIjoOAsDPC1I/ZdtKcLf/YI+T/Ktl26IEWLKefocO4oQdOQicI/enjBTBezEIHLVge2eF0cBI6IHSj0ONUC5YTK5yuSrYdeqYFy8EO6JmT0AGhP328AMaLIegfmlJI0a+rRpS3PrA8GHnM3Crd5QYANsBYA3YTZBizXvSb5h8fWZ5cNwHGehX8q3zX2wDYhqc4WFEtWB7bcZql1ns/u+39bOd9EtiRQZkjGZTc7kOy60NqRxsSN9qQRFqoS4BawV7V7CR1QOpPv98C7rdGrZ5GWiirh1YETBo5v7sC7q6jGDYBh829hRBgCJmlnmihqEd2RH9ifeaktzMn3WVOZsfBKnUkcu40P6D5jVDfTfZpqoWivr2dScfq+C9qdpI6IPX9Bj1g0D+F+WMtlPljOyIndpM6IHVDkfO1B+gjnivdeIB+jwYANsD4phsFtzbd3UeVQcSK45Wu3/YnKKEWij3Y8dwqdN/9gO7fm/sB3W905syIFiyHwIpJPyPWU4+0UNSpHdQjN6kDUt9DwgAmjFnqgRZqN83sSJjATeqA1I1Q3w2MQLRQCbN9Eh6O1etf1Gyt179222bEQePCbZt7NQCwAXvKHcDcMUsdtFDUIzvSHtykDkjdLHWqhaIe2+F16iZ1QOpmqd8IRT2xw+uhm9QBqZulHmmhqKd2eN36O5FppgXLSWrHW02Zm14H9LpZ6rEWinpmR8LEblIHpL6HXAfMdbNeT7RgOQ3sSJjETeqA1M1ST7VQ1IkdXk/dpA5I3Sz1TAtFHezwuvW7KQUtFHU7nmpQcNPrgF7fg9cBvW6WeqCFok7tSJjATeqA1M1SJ1oo6na8EUaJm9QBqZulTrVQ1O14R4BSN6kDUjdL/UYo6na8I0BDN6kDUjdLPdJCUbfjHQEauUkdkLpR6uGNUNTteFodhm5SB6Ru9jtiN4LlAHZ8M49aTx20UNSJHdTBTa8Den0PXgf0uqHd9GtfxxvxKOPG1/Hu0QDABhifJ+/4Tw+jn+Iz4iZ1QOpmqd8IloeWeD10kzogdaPU73i/evRet//96rupg6PU+00z+PNhWDHeDeANK6/r25Lt/vxRo4ak085/WVfLZi0aeSJkXxXD6UbO5lWzRw67ysyu+/W6q9R6ef1uxYeqWU5t+X3f9hNb82Hb96KQojyanNtPO3nt9ZO2FPXEln7AB6GWXk/wGr+UQh2Gy/OqFtNbes8LWbXNxBb+/vN6D66ux4iprPrdm+/7qpxYq4/WfDm163qCUfbm49uTCbp7F2QQ0GAqay54sRJ+KRZ8U8tixXs5zH4Z2jEs/FNpT9PyyQwpF9M7dp+JZTWo2WyCo9mRmsUvR7jmv/4GyMfqZFivAAA=" # pragma: allowlist secret
# # }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair a bit of logic here and some from utils package. We should test this when we have more time a bit, down track. I can help with this. No action for now; early day.

Copy link
Member

@reisingerf reisingerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

this,
'icav2_copy_batch_state_machine',
props.icav2_copy_batch_utility_state_machine_name
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We learn...

I can throw another option into the mix: create your own (internal) ICAv2CopyBatchUtility exclusively for this stack's purpose. No dependency on existing/external stacks.

@alexiswl alexiswl force-pushed the feature/bssh-icav2-fastq-copy-manager branch from a3df96d to 8af8b68 Compare May 1, 2024 02:41
@alexiswl
Copy link
Member Author

alexiswl commented May 1, 2024

Can we have one last review of this before there's another refactor on the main branch pls?

@alexiswl alexiswl linked an issue May 1, 2024 that may be closed by this pull request
@victorskl
Copy link
Member

sure, going through...

Copy link
Member

@victorskl victorskl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, Alexis. Merge as you see fit. Thanks.

@alexiswl
Copy link
Member Author

alexiswl commented May 1, 2024

Thanks! One last synth check

@alexiswl alexiswl merged commit 06fce2d into main May 1, 2024
2 checks passed
@alexiswl alexiswl deleted the feature/bssh-icav2-fastq-copy-manager branch May 1, 2024 03:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use camel case for construct and class attributes Implement Copy Fastq Outputs from BSSH Project Manager
3 participants